Monday, December 28, 2009

Building Security Systems

Having been a software developer for over eighteen years, I have observed a number of recurring problems, and one of them is building the security system. Most systems you build will require some kind of security, so in this post I will go over the core concepts involved in adding security to your system.


User Registration


A prerequisite for any security system is to allow users to register with the system and to store those users in a database, LDAP, Active Directory, or some other storage system. For an internal application, though, this step may be unnecessary.


Authentication


Authentication allows a system to validate users based on a password or another form of verification. Within a company, users may have to use multiple internal applications, each with its own authentication, and each external website also requires its own credentials. This quickly becomes burdensome for both users and applications, as users have to remember the passwords and systems have to maintain them. Thus, many companies employ some form of Single Sign-On, and I have used many solutions such as SiteMinder, iChain, Kerberos, OpenSSO, Central Authentication Service (CAS), and home-built solutions. Many of these Single Sign-On systems use a reverse proxy server that sits in front of the application, intercepts all requests, and automatically redirects users to the login page if they are not authenticated. When an internal system consists of multiple tiers such as services, it is often required to pass authentication tokens to those services. In J2EE systems, you can use the Common Secure Interoperability (CSIv2) protocol to pass the authentication to other tiers; it uses the Security Attribute Service (SAS) protocol to perform client authentication and impersonation.




For external systems, OpenID is the way to go, and I have used RPX to integrate OpenID into a number of sites I have developed, such as http://wazil.com/ and http://dealredhot.com/.



A number of factors make authentication a bit tricky. When part of your system does not require authentication, you have to ensure the authentication policy is applied correctly. Also, authentication generally requires https instead of http, so you have to ensure that the site uses those protocols consistently. In general, static content such as CSS, JavaScript and images does not require authentication, but it is often put behind authentication by mistake.




Another factor related to authentication is session management. A session determines how long the user can access the system without logging in again. Many systems provide a remember-me feature, but sessions often consume resources on the server. It’s essential to keep the session small, as it can affect scalability if it’s stored on the server. I generally prefer keeping the session very small and storing only the user-id and a couple of other database ids such as shopping-cart-id, request-id, etc. If the session data is small, it can also be stored in cookies, which makes the system stateless so you can scale easily.


Authorization


Not all users are the same in most systems, so authorization provides access control to limit usage based on permissions. There are a number of ways to define authorization, such as access control lists, role-based access control (RBAC), capability-based security, etc. In most systems, I have used J2EE/EJB Security, Java Web Security, JAAS, Acegi (which is now part of Spring), and home-built systems. As security is a cross-cutting concern, I prefer to define the rules declaratively in a common security file or with annotations. There is nothing worse than sporadic security code mixed with your business logic.




One feature I have found lacking in most open source and commercial tools is support for instance-based security, or dynamic security, that verifies runtime properties. For example, in most RBAC systems you can define a rule that a purchase order can be approved by the role “POApprover”, but you cannot say that a “POApprover” can only approve if the user is from the same department or if the amount is less than $10,000, etc.
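To make the purchase order example concrete, here is a minimal Python sketch of an instance-based check layered on top of a role check. The names `User`, `PurchaseOrder`, `can_approve` and `APPROVAL_LIMIT` are illustrative assumptions and not the API of any RBAC product, and combining the department and amount conditions with "and" is just one possible policy.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    department: str
    roles: set = field(default_factory=set)


@dataclass
class PurchaseOrder:
    department: str
    amount: float


# Hypothetical limit taken from the example in the text.
APPROVAL_LIMIT = 10_000


def can_approve(user: User, po: PurchaseOrder) -> bool:
    """Check the static role first, then the runtime (instance) conditions."""
    if "POApprover" not in user.roles:
        return False
    same_department = user.department == po.department
    under_limit = po.amount < APPROVAL_LIMIT
    return same_department and under_limit


alice = User("alice", "engineering", {"POApprover"})
print(can_approve(alice, PurchaseOrder("engineering", 2_500)))
print(can_approve(alice, PurchaseOrder("sales", 2_500)))
```

The role check alone would accept both orders; the second one is rejected only because the runtime properties of the order are inspected.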


UI or Resource Protection


When users have various levels of access, it is essential to hide the UI elements and resources that are not accessible to them. I have seen some systems employ security by obscurity, which only hides the resources without actually enforcing the permissions, but that is a bad idea. Resource protection can get complicated when the access level is very fine grained, such as when a single form shows different fields based on role and permissions.


Database Security


Security must be enforced in depth, across the UI, business and database tiers. Database operations must use security to prevent access to unauthorized data. For example, let’s assume a user can post and edit blogs; it is essential that the database only allows the user to modify his/her own blog. Also, it is critical that any kind of sensitive data, such as passwords or personal identification, is protected with encryption. This is another reason I like OpenID or an SSO solution: you don’t need to maintain the passwords yourself.
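The blog ownership rule above can be sketched at the data tier by making the owner part of the update condition. This is a minimal illustration using Python's built-in sqlite3 module; the `blogs` table, its columns and the `update_blog` helper are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blogs (id INTEGER PRIMARY KEY, owner_id INTEGER, body TEXT)"
)
conn.execute("INSERT INTO blogs (id, owner_id, body) VALUES (1, 100, 'first post')")


def update_blog(conn, blog_id, user_id, new_body):
    """Update a blog only if it belongs to user_id; return True on success."""
    cur = conn.execute(
        "UPDATE blogs SET body = ? WHERE id = ? AND owner_id = ?",
        (new_body, blog_id, user_id),
    )
    conn.commit()
    # No row matches when the caller is not the owner, so nothing changes.
    return cur.rowcount == 1


print(update_blog(conn, 1, 100, "edited by owner"))
print(update_blog(conn, 1, 200, "edited by someone else"))
```

Even if a check in the UI or business tier is skipped by mistake, the statement itself cannot modify another user's row.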


Method/Message Security


Message security ensures that a user only invokes the operations that he/she is authorized to invoke. For example, Acegi provides an annotation-based mechanism to protect methods from unauthorized access.



Data Integrity


Any communication-based system may need to use a message authentication code (MAC) to detect changes to the data.
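As a small illustration of a MAC, the following Python sketch signs a message with HMAC-SHA256 from the standard library and verifies it on the receiving side. The shared key and the message format are assumptions for the example; in practice the key would be distributed out of band.

```python
import hashlib
import hmac

SECRET_KEY = b"example-shared-secret"  # assumption: known to both parties


def sign(message: bytes) -> str:
    """Return the hex MAC of the message under the shared key."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(sign(message), tag)


tag = sign(b"amount=100&to=bob")
print(verify(b"amount=100&to=bob", tag))
print(verify(b"amount=900&to=bob", tag))
```

Any change to the message after signing makes the verification fail, which is exactly the tamper detection described above.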


Confidentiality


Any communication-based system may need to encrypt sensitive data in transit, for example with HTTPS.


Non-repudiation


The system must audit users’ actions so that they cannot repudiate them.


Summary



Achieving a high level of security can be difficult and expensive, so you need to treat security as a risk and employ the level of security that suits the underlying system. Finally, as I have found instance-based security lacking in most RBAC systems, I have started my own open source project, PlexRBAC, to provide it. Of course, if you are interested in assisting with the effort, you are welcome to join the project.

Monday, December 14, 2009

Dynamic Inheritance and Composition using Object Extension Pattern

Static Inheritance


Inheritance is a core feature of object-oriented languages that has been used to simulate the real world by modeling closely related objects and to build reusable code. The inheritance relationship is defined statically in class specifications, and it comes in various flavors such as:


Single Inheritance


It allows a class to be extended by just one other class.


Multiple Inheritance


It allows a class to be derived from multiple classes. Historically it has been difficult to maintain and has been the source of the diamond inheritance problem in C++, though other languages define a lookup order, such as the Method Resolution Order (MRO) in Python, to avoid those issues.
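As a quick illustration of how Python's MRO settles a diamond, consider the following sketch. The class names are made up for the example.

```python
class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Child(Left, Right):
    """Inherits from both branches of the diamond."""


# The MRO is a single linear order, so the lookup of who() is unambiguous.
print([cls.__name__ for cls in Child.__mro__])
print(Child().who())
```

The order follows the base class list of `Child` from left to right, and the shared `Base` appears only once, after both of its subclasses.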



Interfaces


The interfaces are used in C# and Java to define methods without implementation and a class can implement multiple interfaces without the downsides of multiple inheritance.


Mixins


Mixins are available in Ruby and D, which use them for code reuse. Mixins are similar to interfaces with implementations, except that they aggregate methods and attributes at runtime.


Traits


Traits are available in Squeak and Scala and are conceptually similar to mixins, except that traits, as originally defined for Squeak, do not carry attributes (state).



Dynamic Inheritance


As opposed to static inheritance, dynamic inheritance can be added at runtime using the Object Extension Pattern, which I first learned in Erich Gamma, et al’s GoF patterns. In the late 90s, I used Voyager ORB for building distributed systems, which used this pattern. The following example shows how this pattern can be used:



Let’s define a marker interface Extension in Java such as:


package ext;

public interface Extension {
}



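Then create a factory class such as:


package ext;

public class ExtensionsFactory {
    // Associates one or more extension classes with a subject class.
    public void register(final Class<?> subject, final Class<? extends Extension>... exts) { /* ... */ }

    // Returns the extension of the requested type for the given subject (body elided).
    public <T extends Extension> T get(final Object subject, final Class<T> extClass) { /* ... */ return null; }
}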

The subject is the object that needs to be extended with extensions. For example, let’s assume you have a User class and you need to add hobbies; you can define User as follows:


package domain;

public class User {
    //...
}

And you then define Hobbies as follows:


package domain;

public class Hobbies implements ext.Extension {
    public Hobbies(User user) {
        // ...
    }
}

At runtime, you can register Hobbies to User and use it as follows:



package test;

import domain.Hobbies;
import domain.User;
import ext.ExtensionsFactory;

public class Main {
    public static void main(String[] args) {
        ExtensionsFactory f = new ExtensionsFactory();

        // bind the Hobbies extension to the User class
        f.register(User.class, Hobbies.class);

        // look up the extension for a particular user
        User user = new User();
        Hobbies hobbies = f.get(user, Hobbies.class);
    }
}
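The factory bodies are elided above, so here is one way the registry could work, sketched in Python to keep it short. This is an assumption about the mechanics, not the original implementation: extension classes are registered per subject class, and an extension instance is created lazily the first time it is requested for a given subject.

```python
class ExtensionsFactory:
    """Maps subject classes to extension classes and builds extensions lazily."""

    def __init__(self):
        self._registry = {}   # subject class -> set of extension classes
        self._instances = {}  # (id(subject), ext class) -> (subject, extension)

    def register(self, subject_cls, *ext_classes):
        self._registry.setdefault(subject_cls, set()).update(ext_classes)

    def get(self, subject, ext_cls):
        # Walk the subject's class hierarchy so subclasses inherit extensions.
        for cls in type(subject).__mro__:
            if ext_cls in self._registry.get(cls, ()):
                key = (id(subject), ext_cls)
                if key not in self._instances:
                    # Keep the subject alongside the extension so its id stays valid.
                    self._instances[key] = (subject, ext_cls(subject))
                return self._instances[key][1]
        return None


class User:
    def __init__(self, name):
        self.name = name


class Hobbies:
    def __init__(self, user):
        self.user = user
        self.items = []


factory = ExtensionsFactory()
factory.register(User, Hobbies)

user = User("alice")
hobbies = factory.get(user, Hobbies)
hobbies.items.append("chess")
print(factory.get(user, Hobbies).items)
```

Asking twice for the same subject returns the same extension object, so state kept in the extension behaves as if it were part of the user.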

Dynamic inheritance allows you to follow the open-closed principle by extending classes without modifying existing classes, and allows you to choose the features that you need at runtime. Of course, dynamic languages such as Ruby make this a lot easier, as you can extend classes or objects with modules at runtime, e.g.


### defining Hobbies extension
module Hobbies
  def hobbies
  end
end

### defining User class
class User
end

### extending a single User object at runtime
user = User.new.extend(Hobbies)

puts user.singleton_methods   # lists hobbies

## or
### binding Hobbies with User at runtime (adds hobbies to the User class object)
class << User
  include Hobbies
end
puts User.singleton_methods   # lists hobbies

In real life, the inheritance relationship can be difficult to get right, and you often have to apply the Liskov Substitution Principle to ensure the base class can be replaced by a derived class in all uses of the base class. However, dynamic inheritance acts more like composition, so the above technique can also be used to implement dynamic composition. Dynamic inheritance or composition allows you to mix and match the features you need at runtime and build extensible systems. This technique has been a key to the successful evolution of the Eclipse IDE. It also goes nicely with the Adaptive Object Modeling technique I described in my last post for building easily extensible systems.

Monday, November 16, 2009

Applying Adaptive Object Model using dynamic languages and schema-less databases



Introduction to Adaptive/Active Object Model



Adaptive or Active Object Model is a design pattern used in domains that require dynamic manipulation of meta information. Though it is quite an extensive research topic, the general idea from the original paper of Ralph Johnson is to treat meta information such as attributes, rules and relationships as data. It is usually used when the number of sub-classes is huge or unknown upfront and the system requires adding new functionality without downtime. For example, let's say we are working in the automobile domain and we need to model different types of vehicles. An object-oriented design would result in a vehicle hierarchy such as follows:




In the above example, the whole type hierarchy is predefined and each class within the hierarchy defines attributes and operations. Adaptive Object Modeling, on the other hand, uses the Type Object pattern, which treats classes like objects. The basic Adaptive Object Model uses the type square model such as:





In the above diagram, the EntityType class represents all classes, and an instance of this class defines the actual attributes and operations supported by a class. Similarly, PropertyType defines the names and types of all attributes. Finally, an instance of the Entity class is the actual object instance that stores a collection of properties and refers to its EntityType.

Java Implementation


Let's assume we only need to model Vehicle class from above vehicle hierarchy. In a typical object oriented language such as Java, the Vehicle class would be defined as follows:


/*
 * Simple Vehicle class
 *
 */
package com.plexobject.aom;

import java.util.Date;

public class Vehicle {

    private String maker;
    private String model;
    private Date yearCreated;
    private double speed;
    private long miles;
    //... other attributes, accessors, setters

    public void drive() {
        //
    }

    public void stop() {
        //
    }

    public void performMaintenance() {
        //
    }
    //... other methods
}


As you can see all attributes and operations are defined within the Vehicle class. The Adaptive Object Model would use meta classes such as Entity, EntityType, Property and PropertyType to build the Vehicle metaclass. Following Java code defines core classes of type square model:


The Property class defines type and value for each attribute of class:



/*
 * Property class defines attribute type and value
 *
 */
package com.plexobject.aom;

public class Property {

    private PropertyType propertyType;
    private Object value;

    public Property(PropertyType propertyType, Object value) {
        this.propertyType = propertyType;
        this.value = value;
    }

    public PropertyType getPropertyType() {
        return propertyType;
    }

    public Object getValue() {
        return value;
    }
    //... other methods
}


The PropertyType class defines type information for each attribute of class:


/*
 * PropertyType class defines type information
 *
 */
package com.plexobject.aom;

public class PropertyType {

    private String propertyName;
    private String type;

    public PropertyType(String propertyName, String type) {
        this.propertyName = propertyName;
        this.type = type;
    }

    public String getPropertyName() {
        return propertyName;
    }

    public String getType() {
        return type;
    }
    //... other methods
}



The EntityType class defines type of entity:


/*
 * EntityType class defines attribute types and operations
 *
 */
package com.plexobject.aom;

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class EntityType {

    private String typeName;
    private Map<String, PropertyType> propertyTypes = new HashMap<String, PropertyType>();
    private Map<String, Operation> operations = new HashMap<String, Operation>();

    public EntityType(String typeName) {
        this.typeName = typeName;
    }

    public String getTypeName() {
        return typeName;
    }

    public void addPropertyType(PropertyType propertyType) {
        propertyTypes.put(propertyType.getPropertyName(),
                propertyType);
    }

    public Collection<PropertyType> getPropertyTypes() {
        return propertyTypes.values();
    }

    public PropertyType getPropertyType(String propertyName) {
        return propertyTypes.get(propertyName);
    }

    public void addOperation(String operationName, Operation operation) {
        operations.put(operationName, operation);
    }

    public Operation getOperation(String name) {
        return operations.get(name);
    }

    public Collection<Operation> getOperations() {
        return operations.values();
    }
    //... other methods
}



The Entity class defines entity itself:


/*
 * Entity class represents instance of actual metaclass
 *
 */
package com.plexobject.aom;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

public class Entity {

    private EntityType entityType;
    // initialized eagerly so that addProperty does not fail on a null collection
    private Collection<Property> properties = new ArrayList<Property>();

    public Entity(EntityType entityType) {
        this.entityType = entityType;
    }

    public EntityType getEntityType() {
        return entityType;
    }

    public void addProperty(Property property) {
        properties.add(property);
    }

    public Collection<Property> getProperties() {
        return Collections.unmodifiableCollection(properties);
    }

    // looks up the operation by name on the type and runs it against this entity
    public Object perform(String operationName, Object[] args) {
        return entityType.getOperation(operationName).perform(this, args);
    }
    //... other methods
}



The Operation interface is used for implementing behavior using Command pattern:



/*
 * Operation interface defines behavior
 *
 */
package com.plexobject.aom;

public interface Operation {

    Object perform(Entity entity, Object[] args);
}



The above meta classes are used to create classes and objects. For example, the type information of the Vehicle class is defined in EntityType and PropertyType, and the instance is defined using the Entity and Property classes as follows. In real applications the type binding would be stored in XML configuration or defined in some DSL, but I am binding programmatically below:


/*
 * an example of binding attributes and operations of Vehicle
 *
 */
package com.plexobject.aom;

import java.util.Calendar;
import java.util.GregorianCalendar;

public class Initializer {

    public void bind() {
        EntityType vehicleType = new EntityType("Vehicle");
        vehicleType.addPropertyType(new PropertyType("maker",
                "java.lang.String"));
        vehicleType.addPropertyType(new PropertyType("model",
                "java.lang.String"));
        vehicleType.addPropertyType(new PropertyType("yearCreated",
                "java.util.Date"));
        vehicleType.addPropertyType(new PropertyType("speed",
                "java.lang.Double"));
        vehicleType.addPropertyType(new PropertyType("miles",
                "java.lang.Long"));
        vehicleType.addOperation("drive", new Operation() {

            public Object perform(Entity entity, Object[] args) {
                return "driving";
            }
        });
        vehicleType.addOperation("stop", new Operation() {

            public Object perform(Entity entity, Object[] args) {
                return "stopping";
            }
        });
        vehicleType.addOperation("performMaintenance", new VehicleMaintenanceOperation());

        // now creating instance of Vehicle
        Entity vehicle = new Entity(vehicleType);
        vehicle.addProperty(new Property(vehicleType.getPropertyType("maker"),
                "Toyota"));
        vehicle.addProperty(new Property(vehicleType.getPropertyType("model"),
                "Highlander"));
        // the deprecated Date(int, int, int) constructor counts years from 1900,
        // so a calendar is used to get January 1, 2003
        vehicle.addProperty(new Property(vehicleType.getPropertyType("yearCreated"),
                new GregorianCalendar(2003, Calendar.JANUARY, 1).getTime()));
        vehicle.addProperty(new Property(vehicleType.getPropertyType("speed"), Double.valueOf(120)));
        vehicle.addProperty(new Property(vehicleType.getPropertyType("miles"), Long.valueOf(3000)));
        vehicle.perform(
                "drive", null);
    }
}



The operations define runtime behavior of the class and can be defined as closures (anonymous classes) or external implementation such as VehicleMaintenanceOperation as follows:


/*
 * an example of operation
 *
 */
package com.plexobject.aom;

class VehicleMaintenanceOperation implements Operation {

    public VehicleMaintenanceOperation() {
    }

    public Object perform(Entity entity, Object[] args) {
        return "maintenance";
    }
}






In real applications, you would also have meta classes for business rules, relationships, strategies, validations, etc. as instances. As you can see, AOM provides a powerful way to adapt to new business requirements, and I have seen it used successfully while working as a consultant. On the downside, it requires a lot of plumbing and tooling support, such as XML based configurations or GUI tools to manipulate meta data. I have also found it difficult to optimize with relational databases, as each attribute and operation is stored in a separate row in the database, which results in excessive joins when building the object. There are a number of alternatives to Adaptive Object Model, such as code generators, generative techniques, metamodeling, and table-driven systems. These techniques are much easier with dynamic languages due to their support for metaprogramming, higher order functions and generative programming. Also, over the last few years, a number of schema-less databases such as CouchDB, MongoDB, Redis, Cassandra, Tokyo Cabinet, Riak, etc. have become popular due to their ease of use and scalability. These new databases avoid the excessive join limitation of relational databases and allow evolution of applications similar to Adaptive Object Model. They are also much more scalable than traditional databases. The combination of dynamic languages and schema-less databases provides a simple way to add Adaptive Object Model features without a lot of plumbing code.


Javascript Implementation


Let's try the above example in Javascript, which supports higher order functions and prototype-based inheritance. First, we will need to add some helper methods to Javascript (adapted from Douglas Crockford's "Javascript: The Good Parts"), e.g.

if (typeof Object.beget !== 'function') {
    Object.beget = function(o) {
        var F = function() {};
        F.prototype = o;
        return new F();
    };
}

Function.prototype.method = function (name, func) {
    this.prototype[name] = func;
    return this;
};

Function.method('new', function() {
    // creating new object that inherits from constructor's prototype
    var that = Object.beget(this.prototype);
    // invoke the constructor, binding -this- to new object
    var other = this.apply(that, arguments);
    // if its return value isn't an object substitute the new object
    return (typeof other === 'object' && other) || that;
});

Function.method('inherits', function(Parent) {
    this.prototype = new Parent();
    return this;
});

Function.method('bind', function(that) {
    var method = this;
    var slice = Array.prototype.slice;
    var args = slice.apply(arguments, [1]);
    return function() {
        return method.apply(that, args.concat(slice.apply(arguments,
            [0])));
    };
});

// as typeof is broken in Javascript, trying to get type from the constructor
Object.prototype.typeName = function() {
    return typeof(this) === 'object' ? this.constructor.toString().split(/[\s\(]/)[1] : typeof(this);
};

There is no need to define the Operation interface, Property and PropertyType, thanks to higher order functions and dynamic language support. The following Javascript code defines the core functionality of the Entity and EntityType classes:

var EntityType = function(typeName, propertyNamesAndTypes) {
    this.typeName = typeName;
    this.propertyNamesAndTypes = propertyNamesAndTypes;
    this.getPropertyTypesAndNames = function() {
        return this.propertyNamesAndTypes;
    };
    this.getPropertyType = function(propertyName) {
        return this.propertyNamesAndTypes[propertyName];
    };
    this.getTypeName = function() {
        return this.typeName;
    };
    var that = this;
    // hasOwnProperty skips inherited entries such as Object.prototype.typeName
    for (var propertyTypesAndName in propertyNamesAndTypes) {
        if (propertyNamesAndTypes.hasOwnProperty(propertyTypesAndName)) {
            // one accessor per property that returns the property's type name
            that[propertyTypesAndName] = function(name) {
                return function() {
                    return propertyNamesAndTypes[name];
                };
            }(propertyTypesAndName);
        }
    }
};

var Entity = function(entityType, properties) {
    this.entityType = entityType;
    this.properties = properties;
    this.getEntityType = function() {
        return this.entityType;
    };
    var that = this;
    var propertyTypes = entityType.getPropertyTypesAndNames();
    for (var propertyTypesAndName in propertyTypes) {
        if (propertyTypes.hasOwnProperty(propertyTypesAndName)) {
            // combined getter/setter: no argument reads, one argument writes
            that[propertyTypesAndName] = function(name) {
                return function() {
                    if (arguments.length == 0) {
                        return that.properties[name];
                    } else {
                        var oldValue = that.properties[name];
                        that.properties[name] = arguments[0];
                        return oldValue;
                    }
                };
            }(propertyTypesAndName);
        }
    }
};



Following Javascript code shows binding and example of usage (again in real application binding will be stored in configurations):

var vehicleType = new EntityType('Vehicle', {
    'maker' : 'String', // name -> typeName
    'model' : 'String',
    'yearCreated' : 'Date',
    'speed' : 'Number',
    'miles' : 'Number'
});

var vehicle = new Entity(vehicleType, {
    'maker' : 'Toyota',
    'model' : 'Highlander',
    'yearCreated' : new Date(2003, 0, 1),
    'speed' : 120,
    'miles' : 3000
});

vehicle.drive = function() {
}.bind(vehicle);

vehicle.stop = function() {
}.bind(vehicle);

vehicle.performMaintenance = function() {
}.bind(vehicle);


As you can see, a lot of plumbing code disappears with dynamic languages.

Ruby Implementation


Similarly, above example in Ruby may look like:

require 'date'
require 'forwardable'
class EntityType
  attr_accessor :type_name
  attr_accessor :property_names_and_types
  def initialize(type_name, property_names_and_types)
    @type_name = type_name
    @property_names_and_types = property_names_and_types
  end
  def property_type(property_name)
    @property_names_and_types[property_name]
  end
end


class Entity
  attr_accessor :entity_type
  attr_accessor :properties
  def initialize(entity_type, attrs = {})
    @entity_type = entity_type
    bind_properties(entity_type.property_names_and_types)
    attrs.each do |name, value|
      instance_variable_set("@#{name}", value)
    end
  end
  def bind_properties(property_names_and_types)
    (class << self; self; end).module_eval do
      property_names_and_types.each do |name, type|
        # reader, e.g. vehicle.maker
        define_method name.to_sym do
          instance_variable_get("@#{name}")
        end
        # writer, e.g. vehicle.maker = 'Honda'
        define_method "#{name}=".to_sym do |value|
          instance_variable_set("@#{name}", value)
        end
      end
    end
  end
end

We can then use singleton classes, lambdas and the metaprogramming features of Ruby to add Adaptive Object Model support, e.g.

vehicle_type = EntityType.new('Vehicle', {
  'maker' => 'String', # class.name
  'model' => 'String',
  'yearCreated' => 'Time',
  'speed' => 'Fixnum',
  'miles' => 'Float'})


vehicle = Entity.new(vehicle_type, {
  'maker' => 'Toyota',
  'model' => 'Highlander',
  'yearCreated' => DateTime.parse('1-1-2003'),
  'speed' => 120,
  'miles' => 3000})
class << vehicle
  def drive
    "driving"
  end
  def stop
    "stopping"
  end
  def perform_maintenance
    "performing maintenance"
  end
end


As you can see, Ruby makes it even more succinct and provides a lot more options for higher order functions, such as monkey patching, lambdas/procs/methods, send, delegates/forwardables, etc.



Schema-less Databases


Now, the second half of the equation for Adaptive Object Model is persistence, which I have found to be a challenge with relational databases. However, schema-less databases such as CouchDB, which I have been using, make it trivial to store meta information as part of the plain data. For example, if I have to store this vehicle in CouchDB, all I have to do is create a database such as vehicles (as with Single Table Inheritance, I could store all types of vehicles in the same database):

curl -XPUT http://localhost:5984/vehicles
curl -XPUT http://localhost:5984/vehicle_types

and then add vehicle-type as

curl -XPOST http://localhost:5984/vehicle_types/ -d '{"maker":"String", "model":"String", "yearCreated":"Date", "speed":"Number", "miles":"Number"}'

which returns

{"ok":true,"id":"bb70f95e43c3786f72cb46b372a2808f","rev":"1-3976038079"}

Now, we can use the id of the vehicle-type and add a vehicle as follows:

curl -XPOST http://localhost:5984/vehicles/ -d '{"vehicle_type_id":"bb70f95e43c3786f72cb46b372a2808f", "maker":"Toyota", "model":"Highlander", "yearCreated":"2003", "speed":120, "miles":3000}'


which returns id of newly created vehicle as follows:

{"ok":true,"id":"259237d7c041c405f0671d6774bfa57a","rev":"1-367618940"}
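Because the type document is plain data, it can also drive checks on the application side. The following Python sketch validates a vehicle document against its type document before it is posted; the `CHECKS` mapping and the `validation_errors` helper are assumptions for illustration and are not part of CouchDB.

```python
vehicle_type_doc = {
    "maker": "String", "model": "String",
    "yearCreated": "Date", "speed": "Number", "miles": "Number",
}

vehicle_doc = {
    "vehicle_type_id": "bb70f95e43c3786f72cb46b372a2808f",
    "maker": "Toyota", "model": "Highlander",
    "yearCreated": "2003", "speed": 120, "miles": 3000,
}

# Assumed mapping from the type names used above to Python checks.
CHECKS = {
    "String": lambda v: isinstance(v, str),
    "Number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "Date": lambda v: isinstance(v, str),  # dates travel as strings in JSON
}


def validation_errors(doc, type_doc):
    """Return a list of problems found when checking doc against type_doc."""
    errors = []
    for name, type_name in type_doc.items():
        if name not in doc:
            errors.append(f"missing property: {name}")
        elif not CHECKS[type_name](doc[name]):
            errors.append(f"{name} is not a {type_name}")
    return errors


print(validation_errors(vehicle_doc, vehicle_type_doc))
```

Adding a new property to a vehicle type then only means updating the type document; neither the database schema nor the validation code changes.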


Summary


Adaptive Object Model, built on dynamic languages that support metaprogramming and generative programming, provides powerful techniques to meet ever-changing requirements, and it can be used to build systems that can be easily evolved with minimum changes and downtime. Also, schema-less databases eliminate a drawback of many AOM implementations, which suffer from poor performance due to excessive joins in relational databases.

Friday, July 31, 2009

Day 1 at #oscon 2009

The first day of OSCon 2009 covered a number of tutorials, and I decided to attend the Google App Engine tutorial for the first half of the day. The Google App Engine API follows the CGI model of web development, i.e., it uses the stdin and stdout files and assumes stateless applications. There is a limit of 10MB on response size and 30 requests per second, and it does not allow streaming. The tutorial started pretty slowly, and we spent the first hour just installing the SDK and the tutorial. The Google App Engine SDK is available from http://code.google.com/appengine/downloads.html. I downloaded the Mac image and then dragged the image to my hard drive. I then double clicked the app icon for the Google App Engine SDK, which installed the SDK under /usr/local/google_appengine. Once the SDK is installed, you have to install the Google App Engine tutorials from http://code.google.com/p/app-engine-tutorial/.



After installing the SDK and tutorial, I copied all files named tutorial? under the SDK. The rest of the session covered those tutorials one by one, though we ran out of time in the end and only got up to tutorial7. In order to run the first tutorial, I went into the tutorial1 directory, e.g.

cd /usr/local/google_appengine/tutorial1

Then started local app server as follows:
python ../dev_appserver.py .


When I pointed my browser to the http://localhost:8080, I was able to see “Hello World!”.


Next, I registered at http://appspot.com. After registering, I received an SMS message for confirmation and was able to complete the registration after entering the confirmation number. Next, I created an application-id on Google App Engine. You can only create 10 app-ids and you cannot delete app-ids, so be careful with ids. You can also use your own domain instead of appspot.com. For my testing purposes, I chose the id “shahbhat”.



Next, I changed app.yaml inside my local tutorial1 directory, which describes how your application is configured. You may also notice index.yaml, which describes the list of indices in the database, though Google App Engine can figure out what queries are being used and create indices automatically. I changed the application name in app.yaml to “shahbhat”, e.g.

application: shahbhat

I then pushed my application to the Google App Engine by typing
python ../appcfg.py update .

I was then able to go to http://shahbhat.appspot.com/ and see my application. Voila! You can also see your application usage at http://appengine.google.com/dashboard?app_id=shahbhat (you will have to change the app_id parameter to your own application id).



Unfortunately, a lot of people had problems getting to that state, so we wasted another half hour in a break while other folks sorted out configuration and deployment issues.
Next, I went through another tutorial to turn on authentication by setting:

login: required

in app.yaml file. Next I added caching by adding expires options in the app.yaml. I was also able to use curl to test my applications and see headers to verify caching, e.g.
curl --include http://localhost:8080

Which showed following when caching was not configured:
Cache-Control: no-cache

When I configured the caching to 2d, I was able to see:

Cache-Control: public, max-age=172800
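The max-age value is simply the configured expiration converted to seconds: 2 days x 24 hours x 3600 seconds = 172800. As a small sketch of that arithmetic, the helper below converts strings made of numbers with d, h, m and s suffixes; `expiration_to_seconds` is a hypothetical function written for this post, not part of the SDK, and the space-separated format it accepts is an assumption.

```python
# Seconds per unit; the unit letters are assumed to be d, h, m and s.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def expiration_to_seconds(expiration: str) -> int:
    """Convert a string such as '2d' or '1d 12h' to a number of seconds."""
    total = 0
    for part in expiration.split():
        number, unit = part[:-1], part[-1]
        total += int(number) * UNIT_SECONDS[unit]
    return total


print(expiration_to_seconds("2d"))  # 172800, matching the header above
```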



The Google App Engine SDK also includes a development console that you can view by going to:
http://localhost:8080/_ah/admin


The Google App Engine supports Django based templates, e.g.
#!/usr/bin/env python

import os
from google.appengine.ext.webapp import template

def main():
    template_values = {"foo" : [1,2,3]}
    template_file = os.path.join(
        os.path.dirname(__file__), "index.html")
    body = template.render(
        template_file, template_values)
    print "Status: 200 OK"
    print "Content-type: text/html"
    print
    print body

if __name__ == '__main__':
    main()


In addition, Google App Engine supports WSGI standard (PEP 333), e.g.

import os
import wsgiref.handlers

from google.appengine.ext import webapp
from google.appengine.ext.webapp import template

class IndexHandler(webapp.RequestHandler):

    def get(self):
        template_values = {"foo" : 1}

        template_file = os.path.join(os.path.dirname(__file__), "index.html")
        self.response.out.write(template.render(template_file, template_values))


def main():
    application = webapp.WSGIApplication([('/', IndexHandler)], debug=True)
    wsgiref.handlers.CGIHandler().run(application)


if __name__ == '__main__':
    main()
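
To make the WSGI contract itself concrete, here is a minimal sketch that uses only the standard library's wsgiref rather than the App Engine webapp framework. The names index_app and call_app are my own, for illustration only; the application is just a callable that takes environ and start_response and returns an iterable of byte strings, as PEP 333 describes.

```python
from wsgiref.util import setup_testing_defaults

def index_app(environ, start_response):
    # A WSGI application: report status and headers, then return the body.
    body = b"foo = 1\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]

def call_app(app, path="/"):
    # Hypothetical test helper: invoke the app without starting a server.
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling call_app(index_app) returns the status line, the headers and the body without any network involved. To serve the same callable locally, you could hand it to wsgiref.simple_server.make_server('', 8000, index_app) and call serve_forever() on the result.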

Other tutorials included authentication APIs such as
create_login_url(dest_url)
create_logout_url(dest_url)
get_current_user()
is_current_user_admin()



The SDK also includes a decorator to add authentication automatically:
from google.appengine.ext.webapp.util import login_required
...
@login_required
def get(self):
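
To show roughly what such a decorator does, the sketch below re-creates the idea with stand-ins so that it runs without the SDK. FakeHandler, current_user and the /login URL are hypothetical; the real decorator relies on the users API listed above and its implementation may differ.

```python
import functools

def login_required(handler_method):
    # Sketch: run the handler only when a user is signed in,
    # otherwise redirect to a login URL that returns to this page.
    @functools.wraps(handler_method)
    def wrapper(self, *args, **kwargs):
        if self.current_user is None:
            self.redirect("/login?continue=" + self.request_path)
            return None
        return handler_method(self, *args, **kwargs)
    return wrapper

class FakeHandler(object):
    # Hypothetical stand-in for a request handler.
    def __init__(self, current_user, request_path="/"):
        self.current_user = current_user
        self.request_path = request_path
        self.redirected_to = None
        self.output = []

    def redirect(self, url):
        self.redirected_to = url

    @login_required
    def get(self):
        self.output.append("Hello, %s" % self.current_user)
```

An anonymous FakeHandler(None, "/todo").get() records a redirect and never runs the handler body, while FakeHandler("alice").get() runs it as usual.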


Finally, we went over datastore APIs for persistence support, e.g.
import os
import wsgiref.handlers

from google.appengine.ext import webapp
from google.appengine.ext.webapp import template
from google.appengine.ext import db

class ToDoModel(db.Model):
    description = db.StringProperty()
    created = db.DateTimeProperty(auto_now_add=True)
    foo = db.FloatProperty(default=3.14)
    bar = db.IntegerProperty()
    baz = db.BooleanProperty(default=False)
    N = db.IntegerProperty()
    l = db.ListProperty(str, default=["foo", "bar"])


class IndexHandler(webapp.RequestHandler):

    def get(self):
        todo = ToDoModel(description = "Hello World", bar=1, baz=True)
        todo.put()

def main():
    application = webapp.WSGIApplication([('/', IndexHandler)], debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == '__main__':
    main()


You can view the model by going to http://localhost:8080/_ah/admin/datastore. The datastore supports a number of types such as string, boolean, blob, list, time and text. However, there are some limitations, e.g. a StringProperty can only store up to 500 characters; a TextProperty can hold longer text, but Google App Engine, which otherwise creates indexes as needed, won’t create an index on a TextProperty. For each row, the datastore assigns a numeric id and a UUID-based key, though you can provide your own key. Also, a row cannot exceed 1MB.
Unfortunately, we ran out of time at this point, so I had to go to http://code.google.com/appengine/docs for further documentation. Overall, I thought it was a good introduction to Google App Engine, but I was disappointed that the instructor wasted a lot of time on setup that could have been used to cover the rest of the tutorials.




For the second half of the day, I attended a session on “Building applications with XMPP”. This was an interesting session that showed the usage of XMPP for a number of use cases such as IM, gaming, real-time social networking, monitoring, etc. The session started with the history of XMPP (Extensible Messaging and Presence Protocol), its Jabber roots and its use of streaming XML. A number of factors contributed to the popularity of XMPP, such as open source implementations, XML, federated networking, low latency, etc. XMPP is also very extensible and supports audio/video via Jingle, geolocation, TLS, SASL, etc. The XMPP architecture is client-server based, where servers are decentralized and federated. XMPP identifies a user with a Jabber id that looks like an email address and consists of a local part, a domain and a resource, e.g. alice@wonderland.lit/TeaParty, where the domain is mandatory, but the local part and resource are optional. There are three types of XMPP stanzas, i.e., presence, IQ and message. The presence stanza is asynchronous, but an IQ stanza requires a response. As opposed to the web architecture, which is based on short-lived connections and stateless requests, XMPP uses one long-lived session and events are received asynchronously.
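
The Jabber id structure described above can be sketched in a few lines. This is a simplified illustration with a function name of my own choosing (parse_jid); it only splits the string and does not perform the character validation that a real XMPP library applies.

```python
def parse_jid(jid):
    # Split "local@domain/resource" into its three parts.
    # The resource is everything after the first "/", and the local
    # part is everything before the first "@" in what remains.
    bare, slash, resource = jid.partition("/")
    local, at, domain = bare.partition("@")
    if not at:
        # No "@": the whole bare JID is the domain.
        local, domain = None, bare
    if not domain:
        raise ValueError("a JID must contain a domain: %r" % jid)
    return (local, domain, resource if slash else None)
```

A full address yields all three parts, while a bare server address such as wonderland.lit yields only the domain, with None for the optional local part and resource.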



Next, the tutorial showed how to build a JavaScript client for Jabber using SleekXMPP and the Strophe library (alternatively, you can use Twisted Words). The example used the BOSH protocol to wrap the XMPP protocol in HTTP. Unfortunately, there was a lot of fast typing and I could not follow all of it. I am waiting for the presenters to post the slides online so that I can use those examples in my own applications.
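
As a rough illustration of that wrapping (not the presenters' code, which I could not capture), BOSH sends XMPP stanzas inside a body element that is POSTed over HTTP. The sketch below only builds such a payload with the Python standard library; the sid and rid values and the addresses are made up, and a real client such as Strophe also handles session creation, polling and incrementing the request id.

```python
import xml.etree.ElementTree as ET

BOSH_NS = "http://jabber.org/protocol/httpbind"

def wrap_message(sid, rid, to, text):
    # Outer BOSH <body/> wrapper carrying the session and request ids.
    body = ET.Element("body", {"xmlns": BOSH_NS, "sid": sid, "rid": str(rid)})
    # Inner XMPP message stanza, qualified by the jabber:client namespace.
    message = ET.SubElement(
        body, "message", {"xmlns": "jabber:client", "to": to, "type": "chat"})
    ET.SubElement(message, "body").text = text
    return ET.tostring(body).decode("utf-8")
```

The returned string would be sent as the body of an HTTP POST to the server's BOSH endpoint, and the server's reply comes back in the same kind of wrapper.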




Thursday, July 30, 2009

Cut the scope and make your life easy

I have been developing software for over twenty years, and in every project you have to grapple with the iron triangle of schedule/cost/functionality, sometimes referred to as cost/quality/schedule or cost/resources/schedule. In my experience, curtailing the scope produces better results than adding more resources or extending the deadline. In addition, slashing the scope has other side effects such as reducing the complexity of the software, an easier learning curve for users, lower training/support costs and better communication among team members.


You can reduce the scope by focusing on essential features using the Pareto principle (80-20 rule); companies like Apple or 37Signals produce great products that are not only more useful but also much simpler to use. However, this is not easy, as the project manager or product owner has to say NO. Too often, I see project managers say YES to anything to please upper management and users. In the end, the team is overwhelmed and under stress. Also, a big pile of features where all features have the same importance (priority) is the biggest reason for death-march projects.



Working with a small number of features reduces complexity, whether essential complexity, cyclomatic complexity or accidental complexity, because your codebase is smaller. You still have to apply good software engineering principles such as domain driven design, unit testing, refactoring, etc., but maintenance becomes easier with a smaller codebase. When you have a small codebase you have fewer bugs, as there are no bugs in code that was never written. Fewer bugs means lower support costs when some user complains of a bug or when the system crashes in the middle of the night.


With a small set of features, the user interface becomes simpler, which in turn provides better usability to the users. Often, I have seen users get confused when they have to work with complex software that has a lot of features. This is often remedied by providing training or adding support, which adds a lot more overhead to the project. Again, a better user interface does not come for free with a small set of features, but the usability problem becomes easier with fewer features.


Finally, a small number of features and a small codebase mean your team size will remain small, so communication among team members becomes easier. I like to work with teams of 5 plus/minus 2, as the number of communication links grows quadratically, n(n-1)/2 for n members, when you add more members. Also, with smaller teams that are colocated, you have better osmotic communication that Alistair Cockburn talks about. At Amazon, we have “2-Pizza” teams, i.e., teams are small enough to have a team lunch with just two pizzas. Another factor when building teams is whether they are cross-functional (vertical) or focus on a single expertise such as systems, database, UI, etc. I prefer working with cross-functional teams that focus on a single service or application, as communication and priorities within a single team are much easier to manage than between different teams.
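
To put numbers on the communication overhead: if every pair of team members counts as one potential communication link, a team of n people has n(n-1)/2 links. The short sketch below (function names are mine) tabulates this for a few team sizes.

```python
def communication_links(team_size):
    # One link per pair of people: n * (n - 1) / 2.
    if team_size < 0:
        raise ValueError("team size cannot be negative")
    return team_size * (team_size - 1) // 2

def links_table(sizes):
    # Pair each team size with its number of links.
    return [(n, communication_links(n)) for n in sizes]
```

For teams of 3, 5, 7 and 14 people, links_table([3, 5, 7, 14]) gives 3, 10, 21 and 91 links, so doubling a team from 7 to 14 more than quadruples the links to maintain.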


In a nutshell, reducing scope not only helps you deliver the software on time and delight your users but also prepares you better to maintain and support the software. Complexity is the number one killer of software and results in buggy and bloated software. You should watch out for “Wouldn’t it be cool if it did X?” kinds of feature requests; often I see developers treat them as a challenge or an opportunity to learn or apply new technology. However, each new feature takes a toll on your existing features, software maintenance and your team.



