A Developers Guide to Amazon SimpleDB (Developers Library)




No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews. Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Change presents a new challenge, a new paradigm, and new technologies to learn. To realize this, all you have to do is look at the evolution of computers. During the 70s, we worked in a world of mainframes and raised floors. Only special people got to touch the computer, while others had to be content watching from outside of the fishbowl.

The 80s brought the mini-computer with dedicated CRT terminals. You could show data on the screen in any color as long as it was green, but the computer was down the hall in the back room. The 80s also introduced the personal computer. The 90s saw the advent of the Internet, and people dialed in, and in the early 2000s, the Internet went viral. As high-speed connections became common, the Internet replaced corporate networks. Computers went from rooms to luggables to "in my briefcase" to "in my pocket."

Selecting a brand and model of server computer is being replaced with renting a virtual server at a hosting service like Amazon. The purchaser of these virtual servers doesn't have to select a hardware "brand." All I am buying is cycles and reliability. This move to virtual servers also changes the capital required to propose the next viral application. I don't need to buy a large database cluster, hoping for the acceptance to fill it.

I am billed for usage, not capacity. SimpleDB is one of those virtual offerings and the topic of this book.

He hangs out on Twitter as pchaganti.

He has built a career on breaking new ground in the computer field. CARES was the first computer system in the world for aging missing children. CARES has been internationally recognized as pioneering work in child aging. Rich has also created several generations of e-learning platforms including Learn it script and most recently Educate Press. Rich is a seasoned software developer with over 30 years of experience.

He is currently leading the effort to build a standards-based web-scale Application server. He is a visiting faculty member with IIIT-Hyderabad for a course on middleware and also speaks at various technology conferences. Anders Samuelsson has over 25 years of experience in the computing industry. The main focus during this time has been with computer security. He currently works for Amazon. I'd like to thank my wife Malena and my son Daniel and daughter Ida, for always standing by me and allowing me to spend time helping out with this book.

I love you forever. He lives near Atlanta with his wife and four children. I would like to dedicate this book to my brother Madhukar, who gave us all a big scare, and with typical panache came out of it stronger than ever, my sister-in-law Meghna for putting the rock of Gibraltar to shame and showing us all how to handle and deal with adversity, and my nephew Yuv, the two year old fire cracker. My two daughters Anika and Anya were understanding and patient beyond their years as I stuck to my Mac at all kinds of weird hours.

Above all, this book would not have made it into the station without the constant support, love and encouragement from my lovely wife Nitika!

But in order to use SimpleDB, you really have to change your mindset. This isn't a traditional relational database; in fact it's not relational at all.

For developers who have experience working with relational databases, this may lead to misconceptions as to how SimpleDB works. This practical book aims to address your preconceptions on how SimpleDB will work for you. You will be led quickly through the differences between relational databases and SimpleDB, and the implications of using SimpleDB. Throughout this book, there is an emphasis on demonstrating key concepts with practical examples for Java, PHP, and Python developers.

You will be introduced to this massively scalable, schema-less key-value data store: what it is, how it works, and why it is such a game changer. You will then explore the basic functionality offered by SimpleDB including querying, code samples, and a lot more. This book will help you deploy services outside the Amazon cloud and access them from any web host. You will see how SimpleDB gives you the freedom to focus on application development.

As you work through this book you will be able to optimize the performance of your applications using parallel operations, caching with memcache, asynchronous operations, and more.

It also illustrates several SimpleDB operations using these libraries.

Chapter 4, The SimpleDB Data Model, takes a detailed look at the SimpleDB data model and different methods for interacting with a domain, its items, and their attributes. It also covers domain metadata and reviews the constraints that SimpleDB imposes on domains, items, and attributes.

Chapter 5, Data Types, discusses the techniques needed for storing different data types in SimpleDB, including numbers, Boolean values, and dates. It also teaches you about XML-restricted characters and encoding them using base64 encoding.

Chapter 6, Querying, describes the Select syntax for retrieving results from SimpleDB, and looks at the various operators and how to create predicates that allow you to get back the information you need.

It practically modifies a sample domain to add additional metadata, including a file key that is again used for naming the MP3 file uploaded to S3. The example used in this chapter shows you a simple way to store metadata on SimpleDB while storing associated content that is in the form of binary files on Amazon S3.

Chapter 8, Tuning and Usage Costs, mainly covers the BoxUsage of different SimpleDB queries and the usage costs, along with viewing the usage activity reports.

Chapter 10, Parallel Processing, analyzes how to utilize multiple threads for running parallel operations against SimpleDB in Java, PHP, and Python in order to speed up processing times and take advantage of the excellent support for concurrency in SimpleDB.

You do not need to know anything about SimpleDB to read and learn from this book, and no basic knowledge is strictly necessary. This guide will help you to start from scratch and build advanced applications.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "clicking the Next button moves you to the next screen".

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase. The downloadable files contain instructions on how to use them. If you find a mistake in one of our books, whether in the text or the code, we would be grateful if you would report this to us.

By doing so, you can save other readers from frustration, and help us to improve subsequent versions of this book. Once your errata are verified, your submission will be accepted and the errata added to any list of existing errata.


Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or web site name immediately so that we can pursue a remedy.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

So why would you use a database that has none of these capabilities? The answer is scalability. This morning, CNN ran a story on your new web application. Yesterday you had 10 concurrent users, and now your site is viral with 50, users signing on. Which database will handle 50, concurrent users without a complex expensive cluster? The answer is SimpleDB.

Why SimpleDB?

You can scale it easily in response to increased load from your successful applications without the need for a costly cluster database server complex.

Getting to Know SimpleDB

SimpleDB, as illustrated in the following diagram, is designed to be used either as an independent data storage component in your applications or in conjunction with some of the other services from Amazon's stable of Cloud Services, such as Amazon S3 and Amazon EC2. The biggest challenge in SimpleDB is learning to think in its unique metaphor. Like speaking a new language, you need to stop translating and start thinking in that language. You can get started anytime you like, and you pay for it based on how much you use it. It is very different from a relational database, and takes a completely different approach toward storing and querying data.

It follows the convention of eventual consistency. Think of it as a single master database for updates and a large collection of read database slaves. Any changes made to your data will need to be propagated across all the different copies. This can sometimes take a few seconds depending upon the system load at that time and network latency, which means that a consumer of your domain and data may not see the changes immediately. The changes will eventually be propagated throughout SimpleDB, but this is an important consideration you need to think about when designing your application.
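One way an application can cope with this propagation delay is to re-read until the change becomes visible. The following is a minimal sketch of that idea, not part of any SimpleDB library: `read_until` and `FlakyReplica` are hypothetical names, and `fetch` stands in for whatever read call your chosen library provides.

```python
import time


def read_until(fetch, is_fresh, attempts=5, delay=0.5, sleep=time.sleep):
    """Call fetch() until is_fresh(result) is true or attempts run out.

    fetch    -- hypothetical zero-argument callable that reads from the store
    is_fresh -- predicate deciding whether the result reflects the write
    Returns the last result read, fresh or not, so the caller can decide.
    """
    result = None
    for attempt in range(attempts):
        result = fetch()
        if is_fresh(result):
            break
        if attempt < attempts - 1:
            # Wait a little longer before each new attempt.
            sleep(delay * (2 ** attempt))
    return result


class FlakyReplica:
    """Toy stand-in for an eventually consistent store: the new value
    only becomes visible after a fixed number of stale reads."""

    def __init__(self, old, new, stale_reads):
        self.old, self.new, self.stale_reads = old, new, stale_reads
        self.reads = 0

    def get(self):
        self.reads += 1
        return self.new if self.reads > self.stale_reads else self.old
```

Whether retrying is appropriate depends on the application; many designs simply tolerate briefly stale reads instead.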

In this book, we cover several basic sample applications. To run any sample, an Amazon account is required. The best way to wrap your head around the way SimpleDB works is to picture a spreadsheet that contains your structured data. For instance, a contact database that stores information on your customers can be represented in SimpleDB as follows: As SimpleDB is a different database metaphor, new terms have been introduced. The use of this new terminology by Amazon stresses that the traditional assumptions may not be valid.

Domain

The entire customers table will be represented as the domain Customers. Domains group similar data for your application, and you can have up to domains per AWS account. If required, you can increase this limit further by filling out a form on the SimpleDB website. The data stored in these domains is retrieved by making queries against the specific domain. There is no concept of joins as in the relational database world; therefore, queries run within a specific domain and not across domains.

Item

Each customer is represented by a unique Customer ID. Items are similar to rows in a database table. Each item identifies a single object and contains data for that individual item as a number of key-value attributes. Each item is identified by a unique key or identifier, or in traditional terminology, the primary key. SimpleDB does not support the concept of auto-incrementing keys, and most people use a generated key, such as the Unix timestamp combined with the user identifier or something similar, as the unique identifier for an item.
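A generated key of the kind described above can be sketched in a few lines. The function names and the `timestamp-user` layout here are illustrative assumptions, not a format that SimpleDB requires; any string that is unique within the domain will do.

```python
import time
import uuid


def make_item_name(user_id, now=None):
    """Build an item name from a Unix timestamp and a user identifier."""
    timestamp = int(time.time() if now is None else now)
    # Zero-pad the timestamp so that names sort chronologically as strings.
    return "%010d-%s" % (timestamp, user_id)


def make_random_item_name():
    """Alternative: a random UUID, which needs no coordination between clients."""
    return str(uuid.uuid4())
```

Note that two items created for the same user within the same second would collide under the first scheme, which is one reason a random identifier may be preferable.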

You can have up to one billion items in each domain. A customer will have a name, a phone number, an address, and other such attributes, which are similar to the columns in a table in a database.


SimpleDB even enables you to have different attributes for each item in a domain. This kind of schema independence lets you mix and match items within a domain to satisfy the needs of your application easily, while still taking advantage of the automatic indexing provided by SimpleDB. If your company suddenly decides to start marketing using Twitter, you can simply add a new attribute to your customer domain for the customers who have a Twitter ID!

In traditional database terminology, there is no need to add a new column to the table.

Values

Each customer attribute will be associated with a value, which is the same as a cell in a spreadsheet or the value of a column in a database. A relational database or a spreadsheet supports only a single value for each cell or column, while SimpleDB allows you to have multiple values for a single attribute. This lets you do things such as store multiple e-mail addresses for a customer while taking advantage of automatic indexing, without the need for you to manually create new and separate columns for each e-mail address, and then index each new column.

In a relational DB, a separate table with a join would be used to store the multiple values. Unlike a delimited list in a character field, the multiple values are indexed, enabling quick searching. It is a simple way of modeling your data, but at the same time, it is different from a relational database model that is familiar to most users. There are several libraries available in different programming languages that encapsulate this entire process and make it even easier to interact with SimpleDB by removing some of the tedium of manually constructing the HTTP requests.
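The idea of multi-valued, automatically indexed attributes can be illustrated with a small in-memory model. This is only a sketch of the visible behavior, not of how SimpleDB is implemented; `ToyDomain` and the sample customer data are made up for the example.

```python
from collections import defaultdict


class ToyDomain:
    """In-memory illustration of items with multi-valued attributes."""

    def __init__(self):
        # item name -> attribute name -> set of values
        self.items = defaultdict(lambda: defaultdict(set))
        # (attribute name, value) -> set of item names; every value is indexed
        self.index = defaultdict(set)

    def put(self, item_name, attribute, value):
        self.items[item_name][attribute].add(value)
        self.index[(attribute, value)].add(item_name)

    def find(self, attribute, value):
        """Return the names of all items holding this value for the attribute."""
        return sorted(self.index.get((attribute, value), ()))


customers = ToyDomain()
customers.put("cust-001", "email", "jane@example.com")
customers.put("cust-001", "email", "jane@work.example.com")
customers.put("cust-002", "email", "bob@example.com")
customers.put("cust-002", "twitter", "@bob")  # only this item has the attribute
```

Here `customers.find("email", "jane@work.example.com")` locates the customer through the second e-mail address without any extra column or separate table.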

The next chapter explores these libraries and the advantages provided by them. The following diagram illustrates the different components of SimpleDB and the operations that can be used for interacting with them:

How is SimpleDB priced?

Amazon provides a free tier for SimpleDB along with pricing for usage above the free tier limit. The charges are based on the machine utilization of each SimpleDB request along with the amount of machine capacity that is utilized for completing the specified request normalized to the hourly capacity of a 1.

This is a significant amount of usage being provided for free for this limited time period by Amazon, and there are many kinds of applications that can operate entirely within this free tier. While a credit card is required to sign up, usage can be checked at any time with the Amazon Account Activity web page. The pricing details might make it a bit daunting to figure out what your costs may be initially, but the free tier provided by Amazon goes a long way toward getting you more comfortable using the service and also putting SimpleDB through its paces without significant cost.

There is also a nice calculator provided on the AWS site that is very helpful for computing the monthly usage costs for SimpleDB and the other Amazon web services.

Why should I use SimpleDB?

You now have an overview of the service, and you are reasonably familiar with what SimpleDB can do. It is a great piece of technology that enables you to create scalable applications that are capable of using massive amounts of data, and you can put this power and simplicity to use in your own applications.

Make your applications simpler to architect

You can leverage SimpleDB in your applications to quickly add, edit, and retrieve data using a simple set of API calls. The well-thought-out API and simplicity of usage will make your applications easier to design, architect, and maintain in the long run, while removing the burdens of data modeling, index maintenance, and performance tuning. You expand your data set as you go and add the new attributes only when they are absolutely needed.

SimpleDB enables you to do this easily, and even the indexing for these newly-added attributes is automatically handled behind the scenes without any need for your intervention.

Create high-performance web applications

High-performance web applications need the ability to store and retrieve data in a fast and efficient way. Amazon SimpleDB provides your applications with this ability while removing a lot of the administrative and maintenance complexities, leaving you free to focus on what's important to you: your application.

Take advantage of lower costs

You pay only for the SimpleDB resources that you actually consume, and you no longer need to lay out significant expenditures up front for database software licenses or even hardware.

The capacity planning and handling of any spikes in load and traffic are automatically handled by Amazon, freeing valuable resources that can be deployed in other areas. SimpleDB pricing passes on to you the cost savings achieved by Amazon's economies of scale.

Scale your applications on demand

And last but most importantly, you can easily handle traffic and load spikes on your applications, as SimpleDB will be doing all of the heavy lifting and scaling for you.

You can even handle the massive and tsunami-like increases in traffic that can result from being mentioned on the front page of Yahoo or Digg, or becoming a trendy topic on Twitter. This enables you to take full advantage of these other services and offload data processing and file storage needs to the cloud, while still using SimpleDB for your structured data storage needs.

Web-scale computing for your application needs along with cost-effectiveness is easier thanks to these cloud services. In the next chapter, we will start interacting with SimpleDB, and getting familiar with creating and modifying datasets utilizing one of the widely available SimpleDB software libraries.

You can download and install the samples on your PHP5 server so that you can try the samples as you read about them.

Getting Started with SimpleDB

You can sign up either by using your e-mail address for an existing Amazon account, or by creating a completely new account. You may wish to have multiple accounts so that billing for separate projects is easier to track. Log in to your AWS account. Provide the requested credit card information and complete the signup process.

The request messages sent through either of these interfaces are digitally signed by the sending user in order to ensure that the messages have not been tampered with in transit, and that they really originate from the sending user. An initial set of keys is automatically generated for you by default. You can regenerate the Secret Access Key at any time if you like. Keep in mind that when you generate a new access key, all requests made using the old key will be rejected.

There are no default certificates generated automatically for you by AWS. You must generate the certificate by clicking on the Create a new Certificate link, then download them to your computer and make them available to the machine that will be making requests to AWS. You can either upload your own x. The REST requests need to be authenticated in order to establish that they are originating from a valid SimpleDB user, and also for accounting and billing purposes. This authentication is performed using your access key identifiers.

If the signatures are different, the request is discarded, and AWS returns an error response. If the timestamp is older than 15 minutes, the request is rejected. The procedure for constructing your requests is simple, but tedious and time consuming. This overview was intended to make you familiar with the entire process, but don't worry—you will not need to go through this laborious process every single time that you interact with SimpleDB. Instead, we will be leveraging one of the available libraries for communicating with SimpleDB, which encapsulates a lot of the repetitive stuff for us and makes it simple to dive straight into playing with and exploring SimpleDB!
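To give a feel for what the libraries do on your behalf, here is a rough sketch of an HMAC-based request signature in the style of AWS Signature Version 2, which is an assumption about the scheme in use rather than something spelled out above. The function names, the host, the parameter values, and the secret are placeholders for illustration; real code should rely on a maintained library.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def canonical_query(params):
    """Sort parameters by name and percent-encode names and values."""
    return "&".join(
        "%s=%s" % (quote(name, safe="-_.~"), quote(str(value), safe="-_.~"))
        for name, value in sorted(params.items())
    )


def sign_request(secret_key, host, params, method="GET", path="/"):
    """Return a base64 HMAC-SHA256 signature over the canonical request."""
    string_to_sign = "\n".join(
        [method, host.lower(), path, canonical_query(params)]
    )
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_fresh(timestamp, now, max_age=timedelta(minutes=15)):
    """Mirror the rule that requests older than 15 minutes are rejected."""
    return now - timestamp <= max_age


# Placeholder values only; never hard-code a real secret key.
example_params = {"Action": "ListDomains", "Version": "2009-04-15"}
example_signature = sign_request(
    "example-secret", "sdb.amazonaws.com", example_params
)
```

The receiving side recomputes the same signature from the request and its copy of the secret key, so any change to the parameters in transit produces a mismatch.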

Most of these libraries provide support for all of the basic operations of SimpleDB. However, Amazon has been working hard to enhance and improve the functionality of SimpleDB, and as a result, they add new features frequently. You will want to leverage these new features as quickly as possible in your own applications. It is important that you select a library that has an active development cycle, so the new features are available fairly quickly after Amazon has released them. Another important consideration is the community around each library.

An active community that uses the library ensures good quality and also provides a great way to get your questions answered. In our experience, this library is a bit too verbose and requires a lot of boilerplate code. It is actively maintained and has a large community of users. This is a comprehensive library that provides access to all of the SimpleDB features. These are all great libraries, and they will be useful to you if your application is written in one of these languages. As you go through the sample code, you can then view the results in the database.

This is invaluable in both viewing results as well as updating or deleting data. One of Firefox's key features is the ability to install plugins to expand the capabilities. Then click on the Click here to install link. Firefox will ask for a confirmation to install the plugin. There is also a checkbox that controls whether the tool can delete a domain.

If you are working on a production database, it is wise to leave this unchecked. A connection to your SimpleDB database will open in a new browser tab. The available domains are listed in the domain area. This area is used to display SQL query results.

Sample outline: performing basic operations

In this book, each sample set will begin with a sample outline.

The sample goals, as well as common SimpleDB principles, will be introduced. The purpose of this sample is to introduce code snippets to create, list, and delete domains as well as create, query, and delete items.

Each domain is a container for storing items. Any item that does not have any attributes is considered empty and is automatically deleted by SimpleDB. You can therefore have empty domains stored in SimpleDB, but not items with zero attributes. SimpleDB stores every attribute value as a string. This is an important consideration, and you need to be aware of it when storing and querying different data types, such as numbers or dates. You must convert such data into an appropriate string format, so that your queries against the data return the expected results.

The conversion of data adds a little bit of extra work on your application side, but it also provides you with the flexibility to enforce data restrictions at your application layer without the need for the data store to enforce the constraints. We will explore data types and their conversions to the appropriate string format in detail in Chapter 5, Data Types.
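As a preview of why the string format matters, the sketch below shows one common approach: zero-padding (with an optional offset for negative numbers) so that string comparison agrees with numeric order, and a sortable timestamp format for dates. The helper names, the ten-digit width, and the offset are assumptions chosen for illustration; the techniques in Chapter 5 may differ in detail.

```python
from datetime import datetime, timezone


def encode_int(value, digits=10, offset=0):
    """Encode an integer as a fixed-width string that sorts numerically."""
    shifted = value + offset
    if shifted < 0 or len(str(shifted)) > digits:
        raise ValueError("value out of range for this encoding")
    return str(shifted).zfill(digits)


def decode_int(text, offset=0):
    """Reverse encode_int."""
    return int(text) - offset


def encode_datetime(moment):
    """Encode a timezone-aware datetime as a sortable UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Without padding, the string "9" compares greater than "10"; with it, `encode_int(9)` sorts before `encode_int(10)` as expected.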

Basic operations with Java

Java is a very popular language used for building enterprise applications. In this section we will download typica and then use it for exploring SimpleDB. The latest version of typica at the time of writing this is 1. Download the ZIP file from the website. Unzip to the folder of your choice and add the typica.

Here is the skeleton of a Java class named ExploreSdb that contains a main method. We will add code to the main method, and you can run the class to see it in action from the console or in the IDE of your choice. This will be our connection to Amazon SimpleDB and will be used for interacting with it. Typica lets you store the keys in a file named aws. In this chapter, we will use the explicit way. In each of the sections below, we will add snippets of code to the main method. You create a domain by calling the createDomain method and specifying a name for the domain.

This will return a list of domain objects. Once you delete a domain, the data is gone forever. So use caution when deleting a domain! This program is easy to understand, use, and expand. Rich Helms has expanded the API and provided samples for this book. Note that the user interface in these samples is very basic. The focus is on illustrating the SimpleDB interface. The entire API was in one file and simple to understand.

All programs are complete and can be run unaltered on your server. The samples use PHP 5. The menu index. When a program is run, the source is shown below in a box. As you go through the programs, use the Firefox SDBtool plugin to examine the database and see the results. The Key and Secret Key values are stored in two session variables. Program config.


If you are downloading and running the source from your site, you can just define the keys and avoid the session variables. Using session variables enables you to try the code at my location and still talk to your SimpleDB without me having access to your keys. File sdb. Once the connection to SimpleDB is made, to create a domain, call the createDomain function passing the domain name. The function returns an array of values. Retrieving a list of all our domains will return an array of domain names. This sample also only deals with a single value for each attribute.

Once the connection is made, three variables are prepared. An array is built with attribute names and values. Then the three variables are passed to the putAttributes function.

Deleting a domain with PHP

Finally, you delete a domain by specifying its name.

Python is an elegant, open source, object-oriented programming language that is great for rapid application development. Python is a stable, mature language that has been around for a long time, and is widely used across many industries and in a large variety of applications.

It comes with an interactive console that can be used for quick evaluation of code snippets and makes experimentation with new APIs very easy. Python is a dynamically-typed language that gives you the power to program in a compact and concise manner. It avoids much of the verbosity associated with a statically-typed language such as Java. It will be much easier to grasp the concepts of SimpleDB without drowning in a lot of lines of repetitive code. Most importantly, Python will bring fun back into your programming!

It was originally conceived by Mitch Garnaat and is currently maintained and enhanced by him and a community of developers. It is by far Prabhakar's favorite library for interacting with AWS, and is very easy to use. Boto works with most recent versions of Python, but please make sure that you are using at least a 2.

Do not use Python 3. There are installers available for Windows and Mac OS X, and the installation process is as simple as downloading the correct file and then double-clicking on the file. If you have Python already installed, you can easily verify the version from a terminal window.

If you are on Windows, just run the downloaded EXE file. If you are running on Linux, use your existing package manager to install it. For instance, on Ubuntu, you can install setuptools using the apt package manager. The latest version at the time of writing this chapter is boto Now change into this new folder and run the install script to install boto.

Before you use boto, you must set up your environment so that boto can find your AWS Access key identifiers. Set up two environment variables to point to each of the keys. If you don't have any errors, then you have boto installed correctly. This will quickly get you familiar with both boto and various SimpleDB operations. Boto will use the environment variable for the Access Keys that we set up in the previous section for connecting to SimpleDB. We first create a connection to SimpleDB. You can also explicitly specify the Access Keys on creation.
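The environment setup described above can be checked with a few lines of standard-library Python before involving boto at all. This sketch assumes the conventional variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; the helper `load_aws_keys` is a hypothetical name and not part of boto.

```python
import os


def load_aws_keys(environ=os.environ):
    """Return (access_key_id, secret_access_key) from the environment.

    Raises RuntimeError with the name of the missing variable, which is
    easier to diagnose than an authentication failure later on.
    """
    try:
        return environ["AWS_ACCESS_KEY_ID"], environ["AWS_SECRET_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError(
            "environment variable %s is not set" % missing.args[0]
        ) from None
```

Passing a plain dictionary as `environ` makes the helper easy to exercise without touching the real environment.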

A new domain can be created by specifying a name for the domain. The name of an item must be unique and is similar to the concept of a primary key in a relational database. The uniqueness of the item name within a domain will cause your existing item attributes to be overwritten with the new values if you try to store new attributes with the same item name. We explored several SimpleDB operations using these libraries. In the next chapter, we will examine the differences between SimpleDB and the relational database model in detail. These relational databases are ubiquitous and are available from a wide range of companies such as Oracle, Microsoft, IBM, and so on.

These databases have served us well for our application needs. However, there is a new breed of applications coming to the forefront in the current Internet-driven and socially networked economy. The new applications require large scaling to meet demand peaks that can quickly reach massive levels. This is a scenario that is hard to satisfy using a traditional relational database, as it is impossible to requisition and provision the hardware and software resources that will be needed to service the demand peaks.

The overwhelming complexity of doing this makes the RDBMS not viable for these kinds of applications. However, in order to provide this solution, SimpleDB makes some choices and design decisions that you need to understand in order to make an informed choice about the data storage for your application domain.

SimpleDB versus RDBMS

No normalization

Normalization is a process of organizing data efficiently in a relational database by eliminating redundant data, while at the same time ensuring that the data dependencies make sense. SimpleDB data models do not conform to any of the normalization forms, and tend to be completely de-normalized.

The lack of need for normalization in SimpleDB allows you a great deal of flexibility with your model, and enables you to use the power of multi-valued attributes in your data. In this example, we will create a simple contact database, with contact information as raw data. The table is inefficient and would require care to update to keep the name data in sync. To find a person by his or her phone number is easy. Searching for phone numbers by name would be ugly if the names got out of sync.

To improve the design, we can rationalize the data. One approach would be to create multiple phone number fields such as the following. While this is a simple solution, it does limit the phone numbers to three. Add e-mail and Twitter, and the table becomes wider and wider.

SCORE: Rationalize data

                                    Strength   Weakness
Efficient storage                   Yes
Efficient search by phone number               No
Efficient search by name            Yes
Easy to add another phone number               No

The design is simple, but the phone numbers are limited to three, and searching by phone number involves three index searches.
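The cost of the three-column design can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, column names, and sample contacts are hypothetical and do not come from the original example.

```python
import sqlite3

# Hypothetical "wide" contact table with a fixed set of three phone columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts ("
    " id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,"
    " phone_1 TEXT, phone_2 TEXT, phone_3 TEXT)"
)
conn.executemany(
    "INSERT INTO contacts (first_name, last_name, phone_1, phone_2, phone_3)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        ("Ada", "Example", "555-0100", "555-0101", None),
        ("Bob", "Sample", "555-0200", None, None),
    ],
)


def find_by_phone(number):
    """Look up a contact by phone number; every phone column must be checked."""
    rows = conn.execute(
        "SELECT first_name, last_name FROM contacts"
        " WHERE phone_1 = ? OR phone_2 = ? OR phone_3 = ?",
        (number, number, number),
    )
    return rows.fetchall()


matches = find_by_phone("555-0101")
print(matches)
```

Each additional phone column would add another comparison to the WHERE clause, and a fourth number for one contact would require altering the table.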

Do this with a small table and no one will notice, but try this on a large database with millions of records, and the performance of the database will suffer. The normalization for relational databases results in splitting up your data into separate tables that are related to one another by keys.
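As a rough sketch of that split, the following uses Python's built-in sqlite3 module with two tables related by a key, and a join to bring the rows back together. The table names, column names, and sample data are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical normalized layout: one row per contact, one row per phone number.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contact (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE phone (contact_id INTEGER REFERENCES contact(id), number TEXT);
    INSERT INTO contact VALUES (1, 'Ada', 'Example');
    INSERT INTO contact VALUES (2, 'Bob', 'Sample');
    INSERT INTO phone VALUES (1, '555-0100');
    INSERT INTO phone VALUES (1, '555-0101');
    INSERT INTO phone VALUES (2, '555-0200');
    """
)


def phones_for(last_name):
    """Return all phone numbers for a contact by joining the two tables on the key."""
    rows = conn.execute(
        "SELECT p.number FROM contact AS c"
        " JOIN phone AS p ON p.contact_id = c.id"
        " WHERE c.last_name = ? ORDER BY p.number",
        (last_name,),
    )
    return [number for (number,) in rows]


ada_numbers = phones_for("Example")
print(ada_numbers)
```

Here a contact can have any number of phone rows, but reading a complete contact always involves both tables.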

A join is an operation that allows you to retrieve the data back easily across the multiple tables. Let's first normalize this data. The table structure is clean and, other than the ID primary key, no data is duplicated. Using two tables would force two selects to retrieve the complete contact information. Let's look at how this would be done using the SimpleDB principles.

No joins

SimpleDB does not support the concept of joins.

Instead, SimpleDB provides you with the ability to store multiple values for an attribute, thus avoiding the necessity to perform a join to retrieve all the values. Unlike a delimited list field, SimpleDB indexes all values, enabling an efficient search on each value.

You don't have to create schemas, change schemas, migrate schemas to a new version, or maintain schemas. This is yet another thing that is difficult for some people from a traditional relational database world to grasp, but this flexibility is one of the keys to the power of scaling offered by SimpleDB.
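The idea of multi-valued attributes with every value indexed can be modeled in a few lines of plain Python. This is a toy in-memory stand-in, not the SimpleDB API: the class name ToyDomain, its methods, and the sample items are all hypothetical.

```python
from collections import defaultdict


class ToyDomain:
    """Toy model of a domain: item name -> attribute name -> set of values."""

    def __init__(self):
        self.items = defaultdict(lambda: defaultdict(set))
        # Index keyed by (attribute, value) so that each individual value is searchable.
        self.index = defaultdict(set)

    def put(self, item_name, attribute, value):
        """Add one more value to an attribute and index that value."""
        self.items[item_name][attribute].add(value)
        self.index[(attribute, value)].add(item_name)

    def find(self, attribute, value):
        """Return the names of items holding this exact attribute value."""
        return sorted(self.index.get((attribute, value), set()))


domain = ToyDomain()
domain.put("ada", "name", "Ada Example")
domain.put("ada", "phone", "555-0100")
domain.put("ada", "phone", "555-0101")  # second value for the same attribute
domain.put("bob", "phone", "555-0200")

owners = domain.find("phone", "555-0101")
ada_phones = sorted(domain.items["ada"]["phone"])
print(owners, ada_phones)
```

All of a contact's phone numbers live on the one item, so no second table or join is needed, and a lookup by any single number goes through the same index.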

You can store any attribute-value data you like in any way you want. If the requirements for your application should suddenly change and you need to start storing data on a customer's Twitter handle for instance, all you need to do is store the data without worrying about any schema changes!

Let's add an e-mail address to the database in the previous example. In the relational database, it is necessary to either add e-mail to the phone table with a type of contact field or add another field. In SimpleDB, there is no concept of a column in a table. The spreadsheet view of the SimpleDB data was done for ease of readability, not because it reflects the data structure.
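To make the absence of columns concrete, here is a minimal sketch that treats each item as its own bag of attributes. The item names, attribute names, and the e-mail address are hypothetical, and plain dictionaries stand in for SimpleDB items.

```python
# Each item carries only the attributes it actually has; there is no shared column list.
items = {
    "ada": {"name": ["Ada Example"], "phone": ["555-0100", "555-0101"]},
    "bob": {"name": ["Bob Sample"], "phone": ["555-0200"]},
}

# Adding an e-mail address touches only the item that has one. No schema change is involved.
items["ada"].setdefault("email", []).append("ada@example.com")


def attribute_names(item_name):
    """Return the attribute names present on a single item."""
    return sorted(items[item_name])


def items_with(attribute):
    """Return the names of items that carry the given attribute at all."""
    return sorted(name for name, attrs in items.items() if attribute in attrs)


print(attribute_names("ada"))
print(attribute_names("bob"))
print(items_with("email"))
```

The second item is unaffected: it simply has no e-mail attribute, rather than an empty e-mail column.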

SQL has evolved over the years into a highly complex language that can do a vast variety of things to your database. SimpleDB does not support the complete SQL language, but instead it lets you perform your data retrieval using a much smaller and simpler subset of an SQL-like query language.

This simplifies the whole process of querying your data.

SimpleDB stores all values as plain text. This simplified textual data makes it easy for SimpleDB to automatically index your data and give you the ability to retrieve the data very quickly. If you need to store and retrieve other kinds of data types such as numbers and dates, you must encode these data types into strings whose lexicographical ordering will be the same as your intended ordering of the data.

As SimpleDB does not have the concept of schemas that enforce type correctness for your domains, it is the developer's responsibility to ensure the correct encoding of data before storage into SimpleDB. Working only in strings impacts two aspects of using the database: queries and sorts. Selecting all records sorted by Quantity will return them in lexicographical order rather than numeric order. Dates present an easier problem, as they can be stored in ISO 8601 format to enable sorting as well as predictable searching.
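The effect of string-only comparison, and one common way around it, can be sketched as follows. The sample quantities and dates are made up for illustration, and the fixed width of six digits is an assumption that must be chosen to fit the largest value you expect. Negative numbers would need an additional offset, which this sketch does not handle.

```python
from datetime import date

WIDTH = 6  # assumed maximum number of digits for any stored quantity


def encode_int(n, width=WIDTH):
    """Zero-pad a non-negative integer so that string order matches numeric order."""
    if n < 0 or n >= 10 ** width:
        raise ValueError("value does not fit the chosen width")
    return str(n).zfill(width)


def decode_int(text):
    """Reverse of encode_int: the leading zeros are simply ignored."""
    return int(text)


# Unpadded strings sort character by character, so "10" comes before "9".
raw = ["9", "10", "200"]
raw_sorted = sorted(raw)

# Padded strings sort in the intended numeric order.
padded_sorted = sorted(encode_int(int(text)) for text in raw)

# ISO 8601 dates (year, month, day, most significant first) already sort correctly as strings.
dates = [date(2010, 10, 1), date(2010, 2, 15), date(2009, 12, 31)]
dates_sorted = sorted(d.isoformat() for d in dates)

print(raw_sorted)
print(padded_sorted)
print(dates_sorted)
```

The same encoded form has to be used in query comparisons as in storage, since the comparison is made on the strings themselves.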

We will cover this in detail in Chapter 5, Data Types.

Updates are done to a central database, but reads can be done from many read-only database slave servers. SimpleDB keeps multiple copies of each domain. Whenever data is written or updated within a domain, first a success status code is returned to your application, and then all the different copies of the data are updated. The propagation of these changes to all of the nodes at all the storage locations might take some time, but eventually the data will become consistent across all the nodes. SimpleDB provides this assurance only of eventual consistency for your data.

This means that the data you retrieve from SimpleDB at any particular time may be slightly out of date. The main reason for this is that the SimpleDB service is implemented as a distributed system, and all of the information is stored across multiple physical servers and potentially across multiple data centers in a completely redundant manner. This ensures the availability and safety of your data at large scale, but comes at the cost of a slight delay before any addition, alteration, or deletion operations you perform on the data are propagated throughout the entire distributed SimpleDB system.
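The behavior described above can be imitated with a small in-memory simulation. This is a deliberately simplified toy, not how SimpleDB is implemented: the class ToyReplicatedStore, the number of replicas, and the explicit propagate step are all assumptions made so that the stale-read window is visible.

```python
class ToyReplicatedStore:
    """Toy model: a write is acknowledged first and reaches the other copies later."""

    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.pending = []  # acknowledged writes not yet applied to every copy

    def put(self, key, value):
        """Apply the write to one copy, queue it for the rest, and acknowledge."""
        self.replicas[0][key] = value
        self.pending.append((key, value))
        return "OK"

    def read(self, key, replica):
        """Read from one specific copy, which may not have seen recent writes."""
        return self.replicas[replica].get(key)

    def propagate(self):
        """Apply all queued writes, in order, to the remaining copies."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


store = ToyReplicatedStore()
store.put("ada", "555-0100")
store.propagate()  # all copies now agree on the first value

status = store.put("ada", "555-0999")  # acknowledged before the other copies are updated
fresh = store.read("ada", replica=0)
stale = store.read("ada", replica=2)  # still returns the older value

store.propagate()
converged = [store.read("ada", replica=i) for i in range(3)]
print(status, fresh, stale, converged)
```

Between the acknowledgement and the propagation step, which copy answers a read determines whether the new or the old value comes back. After propagation all copies return the same value.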

Your data will eventually be globally consistent, but until it is consistent, the possibility of retrieving slightly outdated information from SimpleDB exists.

Mocky tours the SimpleDB platform and APIs, explains their essential characteristics and tradeoffs, and helps you determine whether your applications are appropriate for SimpleDB. Next, he walks you through all aspects of writing, deploying, querying, optimizing, and securing Amazon SimpleDB applications, from the basics through advanced techniques.

Throughout, Mocky draws on his unsurpassed experience supporting developers on SimpleDB's official Web forums. He offers practical tips and answers that can't be found anywhere else, and presents extensive working sample code-from snippets to complete applications.