Thoughts on the Economics of the AWS Cloud

A couple of days ago Amazon published a short whitepaper on the considerations one might weigh when building a data center vs. using Amazon’s web services. In my experience most people use EC2 and S3 for their app, and the more adventurous people branch out into Amazon’s other services. Most people need storage and compute power; fewer people need Elastic MapReduce.

Financially, for a small-scale production or testing platform, using Amazon Web Services makes a lot of sense for a company whose core business is not IT-centric. Running a couple of m1.small EC2 instances and paying for a backup to S3 or EBS is cheaper than purchasing a box and running the app in colo.

Since the start of the AWS platform I had the opinion that, as one starts using more and more EC2 instances and S3 storage space, there comes a point where it would be cheaper to bring the IT platform in house and hire a staff to run it instead of running on AWS forever. I have no idea where that point is, nor do I know of anyone with credibility who does. I thought the dollar amount where in-housing is cheaper is quite high.

I say those things in the past tense. At the risk of being accused of falling victim to Amazon propaganda, after reading the Economics of the AWS Cloud paper I think the expense of bringing a production app in-house for financial reasons alone is too great to overcome the benefits of using Amazon web services. What Amazon provides in terms of reliability, redundancy, management and scale cannot be matched unless you plan to compete with Amazon. This completely ignores in-housing due to compliance and auditing. AWS is not an option in that case.

The paper makes the argument that we can hit closer to 100% server utilization on AWS than on an owned IT infrastructure. I’m not sure I buy this argument: idle CPU and unused memory are idle wherever they are. Still, using a pay-as-you-go service implies someone might be more aware of hardware utilization, because using more costs more in EC2. I guess the comparison is owning 10 boxes at 5% utilization vs. renting 2 EC2 instances at 60% utilization. Over 5 years renting EC2 instances at full price is cheaper than the capital expense of buying those 10 boxes.
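To make the 10-boxes-vs-2-instances comparison concrete, here is a rough sketch of the arithmetic. The utilization figures come from the paragraph above; the hourly rate and the per-box price are made-up placeholders, not real AWS or hardware prices, so the cost result only shows how the comparison would be set up.

```java
class UtilizationSketch {

  // Busy capacity expressed in "fully used machine" units:
  // machine count multiplied by average utilization.
  static double busyMachines(int machines, double utilization) {
    return machines * utilization;
  }

  // Total rental cost for instances that run around the clock for a number of years.
  static double rentalCost(int instances, double hourlyRate, int years) {
    return instances * hourlyRate * 24 * 365 * years;
  }

  public static void main(String[] args) {
    // Utilization figures from the text: 10 owned boxes at 5%, 2 rented instances at 60%.
    double owned = busyMachines(10, 0.05);
    double rented = busyMachines(2, 0.60);
    System.out.printf("owned busy capacity:  %.1f machines%n", owned);
    System.out.printf("rented busy capacity: %.1f machines%n", rented);

    // ASSUMPTIONS: both prices below are hypothetical placeholders.
    double assumedHourlyRate = 0.10;  // dollars per instance-hour (placeholder)
    double assumedBoxPrice = 2000.0;  // dollars per purchased box (placeholder)

    double rentTotal = rentalCost(2, assumedHourlyRate, 5);
    double buyTotal = 10 * assumedBoxPrice; // capital expense only; ignores power, cooling and staff
    System.out.printf("5-year rental: $%.2f, purchase: $%.2f%n", rentTotal, buyTotal);
  }
}
```

Plug in your own prices; the point is only that the rented setup does more useful work on far fewer machines, and the purchase side still leaves out power, cooling, rack space and staff.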

The Power Efficiency and Enabling Redundancy sections of the paper are particularly important. Running a data center costs more than just the price of the computers in it. Power and cooling aren’t free. Rack space isn’t free. The investments Amazon can afford to make in improving power efficiency and data center design are much greater than most other data center owners can afford. If you can afford to make this investment you’re already operating at Amazon-scale, so why aren’t you competing with them?

Based on the go-big-or-go-home data center, who should build data centers? CDNs, hosting providers, colo facilities and businesses that are subject to regulation? Is that it? As someone who has been using AWS and other hosting providers for years for small customers and doing hardware recommendations for existing, poorly designed, non-regulated company data centers for large customers, I’m partial to the AWS/hosted approach if there is not already hardware that can do the job on site. Hardware procurement in large companies is a nightmare and hasn’t changed since I started doing this.

So where are we headed with AWS? Are new tech startups going to forgo hardware purchases and use AWS as their infrastructure? In the last year I have worked with a lot of people making that choice. What about existing small businesses? I haven’t done a lot of work where people are migrating away from their colo’d software and I don’t expect them to, at least until the next upgrade cycle. The upgrade cycle is another thing Amazon handles “in the cloud” for us. Certainly a small business needs to be aware of the AWS option, though I don’t have a lot of experience with the IT companies that small businesses out-source to.

Of course I expect (non-regulated) big business to continue building poorly designed data centers based on use-it-or-lose-it budgets. “Our team has 30 boxes sitting idle and we need to order 10 more!”

These were just some of my thoughts while reading the paper. Leave a comment with any feedback or your own thoughts on the Economics of the AWS Cloud paper.

Wiring RESTful web services with Spring

I’ve recently been using Spring 3.0M1 to make some RESTful APIs. While working on the project a question came up a few times from different people: how do the <jee:jndi-lookup/>, <context:annotation-config/> and <context:component-scan/> beans relate to each other? I’ve got some sample code that I use as a reference when this question comes up.

Let’s start with the <jee:jndi-lookup/> bean. Since a RESTful API probably runs in a web container we need to look up a data source, persistence unit or persistence context. We define the persistence unit in WEB-INF/classes/META-INF/persistence.xml.

<?xml version="1.0"?>
<persistence xmlns="http://java.sun.com/xml/ns/persistence" version="1.0">
  <persistence-unit name="bookmarksPU" transaction-type="RESOURCE_LOCAL">
    <provider>org.apache.openjpa.persistence.PersistenceProviderImpl</provider>
    <non-jta-data-source>jdbc/bookmarks</non-jta-data-source>
    <class>net.anthonychaves.bookmarks.models.User</class>
    <class>net.anthonychaves.bookmarks.models.Bookmark</class>
  </persistence-unit>
</persistence>

The bookmarksPU persistence unit is registered with the container when the webapp is deployed. It looks up the jdbc/bookmarks datasource and uses it for access to the database. Setting up this datasource is up to you based on your container.

The bookmarksPU is also bound to a JNDI name, persistence/bookmarksPU. We can inject this persistence unit into our webapp in the form of an EntityManagerFactory after registering a bean in our Spring container. We do this in our Spring configuration file by looking up the persistence unit in JNDI.

<jee:jndi-lookup id="bookmarksPU" jndi-name="persistence/bookmarksPU"/>

In our service layer we have a class UserService.

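import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.PersistenceUnit;
import javax.persistence.Query;

import net.anthonychaves.bookmarks.models.User;

import org.springframework.stereotype.Service;

@Service
public class UserService {

  // Injected by Spring from the bookmarksPU persistence unit
  @PersistenceUnit(unitName="bookmarksPU")
  EntityManagerFactory emf;

  public void saveUser(User user) {
    EntityManager em = emf.createEntityManager();
    try {
      em.getTransaction().begin();
      em.persist(user);
      em.getTransaction().commit();
    } finally {
      // Roll back if the commit never happened, then always release the EntityManager
      if (em.getTransaction().isActive()) {
        em.getTransaction().rollback();
      }
      em.close();
    }
  }

  public User findUser(String name) {
    EntityManager em = emf.createEntityManager();
    try {
      Query query = em.createQuery("select u from User u where u.name = ?1")
                      .setParameter(1, name);
      em.getTransaction().begin();
      // getSingleResult throws NoResultException when no user matches
      return (User) query.getSingleResult();
    } finally {
      // The lookup is read-only, so end the transaction without committing
      if (em.getTransaction().isActive()) {
        em.getTransaction().rollback();
      }
      em.close();
    }
  }
}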

Notice two things about this class: it is annotated with the org.springframework.stereotype.Service annotation and its EntityManagerFactory is annotated with the javax.persistence.PersistenceUnit annotation. We want Spring to deal with both of these annotations. We want the component, the class annotated by @Service, registered as a Spring bean without doing it manually in XML. We also want Spring to inject an instance of EntityManagerFactory bound to the persistence/bookmarksPU persistence unit.

One of the nice things about Spring is that we can rely on these annotations so we don’t have to create long XML configuration files. We still need the XML config to tell Spring which classes it should examine for these annotations. Enter the <context:annotation-config/> and <context:component-scan/> beans.

Each of these beans implicitly registers instances of classes that implement the BeanPostProcessor interface. The post processor created by the <context:annotation-config/> bean that is most important to our case is the PersistenceAnnotationBeanPostProcessor. This post processor finds the bookmarksPU bean created when we performed the JNDI lookup and injects it into the emf field annotated with @PersistenceUnit.
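To get a feel for what an annotation-driven post processor does, here is a toy model in plain Java reflection. It is not Spring's implementation: the @InjectByName annotation, the DemoService class and the map-based registry are all hypothetical stand-ins for @PersistenceUnit, UserService and the bean factory.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

class InjectionSketch {

  // Hypothetical stand-in for @PersistenceUnit (not the javax.persistence annotation).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface InjectByName {
    String value();
  }

  // Hypothetical service with one annotated field, mirroring UserService.emf.
  static class DemoService {
    @InjectByName("bookmarksPU")
    Object emf;
  }

  // Walks the declared fields of a bean and fills every annotated field
  // with the registry entry whose name matches the annotation value.
  static void postProcess(Object bean, Map<String, Object> registry) {
    for (Field field : bean.getClass().getDeclaredFields()) {
      InjectByName marker = field.getAnnotation(InjectByName.class);
      if (marker == null) {
        continue; // field is not marked for injection
      }
      Object dependency = registry.get(marker.value());
      if (dependency == null) {
        throw new IllegalStateException("No bean named " + marker.value());
      }
      try {
        field.setAccessible(true);
        field.set(bean, dependency);
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Could not inject " + field.getName(), e);
      }
    }
  }

  public static void main(String[] args) {
    // The map plays the role of the bean factory holding the JNDI lookup result.
    Map<String, Object> registry = new HashMap<>();
    registry.put("bookmarksPU", "stand-in for the EntityManagerFactory");

    DemoService service = new DemoService();
    postProcess(service, registry);
    System.out.println(service.emf);
  }
}
```

The real post processor does considerably more (lifecycle callbacks, setter injection, EntityManager proxies), but the core idea is the same: look for the annotation on a freshly created bean and fill in the dependency before the bean is handed out.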

The <context:annotation-config/> bean takes care of the javax.persistence annotation but it does not create a bean definition for the UserService class. To do that we need the <context:component-scan/> bean. This bean scans the specified base package for classes annotated with @Component or one of its specializations (annotations that are themselves annotated with @Component), which include @Service. By annotating the UserService class with @Service we are spared from configuring it by hand in XML.
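The way a stereotype like @Service counts as a @Component can be sketched with plain reflection. This is only an illustration, not Spring's scanner: the Component and Service annotations below are hypothetical local copies, and the candidate classes are passed in by hand instead of being discovered on the classpath.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

class StereotypeScanSketch {

  // Hypothetical local copy of a base stereotype annotation.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Component {}

  // A specialization: an annotation that is itself annotated with @Component.
  @Component
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Service {}

  @Service
  static class DemoUserService {}

  static class PlainHelper {}

  // True when the class carries @Component directly or through a meta-annotated stereotype.
  static boolean isComponent(Class<?> type) {
    if (type.isAnnotationPresent(Component.class)) {
      return true;
    }
    for (Annotation annotation : type.getAnnotations()) {
      if (annotation.annotationType().isAnnotationPresent(Component.class)) {
        return true;
      }
    }
    return false;
  }

  // Returns the simple names of the candidates that would be registered as beans.
  static List<String> scan(Class<?>... candidates) {
    List<String> detected = new ArrayList<>();
    for (Class<?> candidate : candidates) {
      if (isComponent(candidate)) {
        detected.add(candidate.getSimpleName());
      }
    }
    return detected;
  }

  public static void main(String[] args) {
    System.out.println(scan(DemoUserService.class, PlainHelper.class));
  }
}
```

Only DemoUserService is detected; PlainHelper carries no stereotype, so it would still need a hand-written bean definition.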

By default the <context:component-scan/> bean picks up every stereotype annotation under its base package (@Component, @Repository, @Service and @Controller), regardless of the kind of application context it runs in, unless filters are configured. It is still worth keeping our service layer configuration separate from the *-servlet.xml WebApplicationContext, so the services can be wired and reused without the web layer. In services.xml we have our services defined.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:jee="http://www.springframework.org/schema/jee"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
                           http://www.springframework.org/schema/context
                           http://www.springframework.org/schema/context/spring-context-3.0.xsd
                           http://www.springframework.org/schema/jee
                           http://www.springframework.org/schema/jee/spring-jee-2.5.xsd">

  <jee:jndi-lookup id="bookmarksPU" jndi-name="persistence/bookmarksPU"/>

  <!-- here we need component-scan for the services and annotation-config for the persistence unit -->
  <context:component-scan base-package="net.anthonychaves.bookmarks"/>
  <context:annotation-config/>

</beans>

An <import/> element pulls all the bean definitions from the imported file into the importing context. If we import the services.xml file in our *-servlet.xml file, the WebApplicationContext will contain the fully wired UserService bean defined there. We have a controller that needs an instance of UserService.

@Controller
@RequestMapping("/user")
public class UserController {

  @Autowired
  UserService userService;

  @Autowired
  ImageCaptchaService icservice;

  @RequestMapping(value="/new", method=RequestMethod.GET)
  public String newUser(ModelMap model) {
    model.addAttribute(new User());
    return "user_new";
  }

  @RequestMapping(method=RequestMethod.POST)
  public String createUser(@ModelAttribute("user") User user,
                           HttpSession session,
                           @RequestParam("j_captcha_response") String captchaResponse) {

    boolean validResponse = icservice.validateResponseForID(session.getId(), captchaResponse);
    if (validResponse) {
      userService.saveUser(user);
      session.setAttribute("user", user);
      return "redirect:/b/user";
    } else {
      return "redirect:/b/user/new";
    }
  }

  @RequestMapping(method=RequestMethod.GET)
  public String user(HttpSession session) {
    if (session == null || session.getAttribute("user") == null) {
      return "redirect:/b/user/new";
    }

    User user = (User)session.getAttribute("user");
    return "user";
  }

  @RequestMapping(method=RequestMethod.POST, value="/login")
  public String login(@RequestParam("name") String username, HttpSession session) {
    User user = userService.findUser(username);
    session.setAttribute("user", user);
    return "redirect:/b/user";
  }

  public void setUserService(UserService userService) {
    this.userService = userService;
  }
}

By specifying the UserService in UserController as @Autowired we expect an AutowiredAnnotationBeanPostProcessor to inject an instance of UserService into an instance of this class. Both the <context:annotation-config/> and <context:component-scan/> beans create instances of AutowiredAnnotationBeanPostProcessor. The <context:annotation-config/> bean creates a PersistenceAnnotationBeanPostProcessor which is not important to us here. We don’t have any persistence units to inject.

The <context:component-scan/> bean creates an AutowiredAnnotationBeanPostProcessor and it also registers a bean definition for the UserController. @Controller is a specialization of @Component, and several annotations are only meaningful in a class annotated with @Controller. Once the controller is registered as a bean, the routes and parameters declared with @RequestMapping, @RequestParam and other @Controller-related annotations are resolved by Spring MVC's DispatcherServlet.
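As a rough picture of how class-level and method-level mappings combine into routes, here is a toy route table built with reflection. It is a sketch of the idea only and not how Spring MVC is implemented: the @Route annotation and DemoController are hypothetical stand-ins for @RequestMapping and UserController, and HTTP methods are ignored.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class RouteTableSketch {

  // Hypothetical stand-in for @RequestMapping, usable on classes and methods.
  @Retention(RetentionPolicy.RUNTIME)
  @Target({ElementType.TYPE, ElementType.METHOD})
  @interface Route {
    String value() default "";
  }

  // Hypothetical controller shaped like UserController.
  @Route("/user")
  static class DemoController {
    @Route("/new")
    public String newUser() {
      return "user_new";
    }

    @Route
    public String user() {
      return "user";
    }

    // No annotation, so this method is never exposed as a route.
    public void setSomething(String ignored) {}
  }

  // Builds a path-to-handler-method table: class-level prefix plus method-level path.
  static Map<String, String> buildRoutes(Class<?> controller) {
    Map<String, String> routes = new TreeMap<>();
    Route typeLevel = controller.getAnnotation(Route.class);
    String prefix = (typeLevel == null) ? "" : typeLevel.value();
    for (Method method : controller.getDeclaredMethods()) {
      Route methodLevel = method.getAnnotation(Route.class);
      if (methodLevel != null) {
        routes.put(prefix + methodLevel.value(), method.getName());
      }
    }
    return routes;
  }

  public static void main(String[] args) {
    for (Map.Entry<String, String> entry : buildRoutes(DemoController.class).entrySet()) {
      System.out.println(entry.getKey() + " -> " + entry.getValue());
    }
  }
}
```

This is why the controller above can declare "/user" once at the class level and only "/new" or "/login" on the individual methods.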

Our bookmarks-servlet.xml file needs only the <context:component-scan/> bean defined to get our web classes up and running. The full bookmarks-servlet.xml file is very simple.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:jee="http://www.springframework.org/schema/jee"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
                           http://www.springframework.org/schema/context
                           http://www.springframework.org/schema/context/spring-context-3.0.xsd
                           http://www.springframework.org/schema/jee
                           http://www.springframework.org/schema/jee/spring-jee-2.5.xsd">

  <import resource="services.xml"/>

  <!-- component-scan creates an implicit AutowiredAnnotationBeanPostProcessor,
             we don't need PersistenceAnnotationBeanPostProcessor -->
  <context:component-scan base-package="net.anthonychaves.bookmarks"/>

  <bean id="viewResolver" class="org.springframework.web.servlet.view.UrlBasedViewResolver">
    <property name="viewClass" value="org.springframework.web.servlet.view.JstlView"/>
    <property name="prefix" value="/WEB-INF/jsp/"/>
    <property name="suffix" value=".jsp"/>
  </bean>

  <bean id="captchaService" class="net.anthonychaves.bookmarks.service.CaptchaServiceSingleton"
        factory-method="getInstance"/>

</beans>

This should hopefully make it very easy for anyone to quickly create at least a skeleton RESTful API with Spring. Comments? Let me know!

Getting started with Mechanize

This is the first post in what I think will be a short series on black-box webapp testing.  We’ll add quite a bit of content to it in the next week or so.

A few weeks back I helped a development team set up a testing environment for their Ruby on Rails webapp.  The webapp is about 18 months old and had exactly zero tests.  Reverifying its intended behavior was a full-time job for some of the developers on the team because there was no way to prove any behavior worked as intended at any given time.  Worse, behavior verification was a manual process and obviously error-prone.

Nati Shalom to speak at Boston Scalability Group next Wednesday

There is an interesting BostonSUG meeting next week at the IBM Innovation Center in Waltham, MA.  GigaSpaces CTO Nati Shalom will speak on cloud-based infrastructure, space-based architecture, scalability with latency in mind and more.  See the BostonSUG web site for more details: http://www.bostonsug.org/2009/02/09/nati-shalom-speaking-on-february-18/

I’ll be there!

Dummy Code (Quick’n’Dirty vs. Engineered)

When creating software, two people will never write the same implementation of a method or system of non-trivial design.  Creating software is a problem-solving process and there are usually many ways to solve one problem.  The solutions may differ in elegance and efficiency while giving the same output for a given set of inputs.  A correct solution is a correct solution regardless of implementation.