Stack Overflow: Hidden limitations of Google App Engine?
[+81] [14] Kyle Cronin
[2009-02-19 15:56:21]
[ google-app-engine bigtable ]
[ http://stackoverflow.com/questions/565963/hidden-limitations-of-google-app-engine ] [DELETED]

I've been looking into writing a web app that will run on Google App Engine, but before I commit myself to the platform I'd like to know what, if any, limitations there are. I'm aware of the basic CPU/bandwidth restrictions that Google places on the free service, but I'm wondering more about development restrictions like how BigTable compares to a standard relational database and what Python libraries aren't available on the GAE platform (and what alternatives Google provides).

Basically I'm looking for any hidden roadblocks before I commit to the platform. Thanks for your help!

(1) if you need SSL on custom domain forget GAE. - mihai
[+41] [2009-02-20 08:43:42] fuentesjr

I think your biggest obstacle (but definitely not a limitation) is overcoming the relational mindset that has become mainstream in the industry. The relational model has its place in this world, but it doesn't solve all problem domains, and you can't depend on it in GAE.

What this means in more practical terms is that you will have to design/architect your applications differently. In several scenarios, the typical patterns and norms you have been accustomed to will not apply. Perhaps one of the most obvious examples is that of shared counters. Anyway, all this is hard to see until you have concrete examples. For starters, however, I have compiled a playlist [1] of some of the most helpful App Engine sessions that were presented at Google I/O last year. I would advise you to watch them, as they do a great job of helping one understand how the platform works.
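
To make the shared-counter example concrete: a later answer in this thread quotes a limit of about five updates per second for any single entity, so a busy counter is usually split across several entities and summed on read. The sketch below is a plain in-memory illustration of that idea, not datastore code; `ShardedCounter` and `num_shards` are hypothetical names, and each list slot merely stands in for one shard entity.

```python
import random


class ShardedCounter:
    """In-memory stand-in for a counter split across several datastore entities."""

    def __init__(self, num_shards=20):
        # Each list slot plays the role of one shard entity (an assumption
        # made for illustration; nothing here talks to a real datastore).
        self.shards = [0] * num_shards

    def increment(self, amount=1):
        # Pick a shard at random so concurrent writers rarely hit the same one.
        index = random.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self):
        # Reading is the expensive side: every shard has to be summed.
        return sum(self.shards)


counter = ShardedCounter(num_shards=20)
for _ in range(1000):
    counter.increment()
print(counter.total())
```

The trade-off is that writes get cheaper while reads get more expensive, which is the opposite of what a relational `UPDATE ... SET n = n + 1` would lead you to expect.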

Update: Just recently on Google App Engine's Blog they describe non-relational databases [2]. I think this will help to give more context.

[1] http://www.youtube.com/view_play_list?p=DA31F43DE4107B05
[2] http://googleappengine.blogspot.com/2009/02/back-to-future-for-data-storage.html

(1) Thank you for this list, all great resources in one place. - zgoda
what problem domains relational model doesn't solve? - lubos hasko
1
[+25] [2009-02-21 11:10:36] alexpopescu

There are a lot of constraints in Google App Engine, and it is quite difficult to list them all.

A good start for you might be to check the list of most voted feature requests and figure out if your project can work without those: http://code.google.com/p/googleappengine/issues/list. There are other small issues related to the dashboard and inconsistencies between the SDK and the public application, but most of these can be avoided.

One thing that I have found quite annoying is the fact that there is no 'blueprint' for building apps in GAE. Basically, even for quite simple web apps you'll find out that there isn't a known path to decide what is the best way to structure your data and get the best performance out of it, nor are there simple ways to profile and understand how to circumvent these.

Building an app deployable on GAE is a fundamental shift from the traditional way of writing apps and while it is exciting, I do think that it might require more effort on your side.

./alex

PS: I'm currently working on a GAE based project, that is deployed here: http://the.dailycloud.net


2
[+25] [2010-06-18 09:03:58] systempuntoout

Having developed a couple of web applications with GAE, I would add:

  1. You have a 10 seconds max Deadline on download with UrlFetch [1]
  2. 30 Seconds Deadline for web requests
  3. 1 Megabyte max for single put()
  4. API currently offers a fairly limited functionality for text-search [2] NOW FIXED [3]
  5. Portability (Open source compatible platforms: AppScale [4] and CapeDwarf [5] available)
  6. Deploy environment does not replicate the restrictions of production (this can be painful for testing. Cfr. "works on my machine")
  7. Random system failures (be prepared to write a lot of defensive code because, sometimes, the system does not work as expected).
  8. No JOINS
  9. No Count (You need to code your statistics upfront)
  10. Limited customer support ( App Engine Google Groups [6] has a lot of unanswered questions )
  11. Python 2.5!
  12. Throttling and quota problems calling third party APIs (ie Twitter, Facebook) due to the shared IPs with other Apps.
  13. No C extensions * (not all)
  14. No writes to file system (it can be done using the file api and blobstore)
  15. No straightforward massive update (you are forced to use map reduce [7])
[1] http://code.google.com/intl/it/appengine/docs/python/urlfetch/overview.html
[2] http://code.google.com/p/googleappengine/issues/detail?id=217
[3] https://developers.google.com/appengine/docs/python/search/overview
[4] http://appscale.com
[5] http://www.jboss.org/capedwarf
[6] http://groups.google.com/group/google-appengine
[7] http://code.google.com/p/appengine-mapreduce/
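
The "defensive code" in item 7 usually comes down to retrying calls that fail transiently. Here is a minimal sketch, assuming failures surface as exceptions. `call_with_retries` and `flaky_put` are hypothetical names, and catching every `Exception` is only for illustration; real code would retry only the error types it knows to be transient.

```python
import random
import time


def call_with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff and jitter on failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of retries: let the caller see the error.
            # Wait roughly 0.1s, 0.2s, 0.4s, ... plus a little random jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


calls = {"count": 0}


def flaky_put():
    # Hypothetical write that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated transient datastore error")
    return "stored"


# The sleep function is replaced with a no-op so the demo runs instantly.
result = call_with_retries(flaky_put, sleep=lambda seconds: None)
print(result, calls["count"])
```

Keep the total retry time well under the request deadline from item 2, otherwise the retries themselves become the reason the request fails.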

3
[+17] [2009-02-23 12:43:04] husayt

There are a number of limitations, though it is a promising product.

The most ironic of them is that GAE didn't support TEXT SEARCH in its database API.

I thought google was all about search, and mostly text search.

UPDATE: at last Google is adding Full-text Search [1]

At last we are adding a full text search service to App Engine. The upcoming service will be built on top of the very infrastructure used by Google. In addition to full text search queries we will also offer numeric, geo, date search capabilities, and much more. This session will cover the basic full text search API, briefly outline more advanced features, and how full text search ties to existing services such as datastore

[1] http://www.google.com/events/io/2011/sessions/full-text-search.html

(1) I heard that this is on the way soon! - Liam
4
[+15] [2009-12-05 19:00:44] Tahir Akram

I have identified/researched the following limitations so far.

  • no HTTPS for custom domains. Only for your-app-id.appspot.com domains
  • no streaming and long term connections
  • a web request must respond within 30 seconds, otherwise GAE throws DeadlineExceededException
  • application will not work with your naked custom domain, i.e. http://yourdomain.com
  • only HTTP and HTTPS. Clients cannot connect to GAE through FTP

(8) +1. Lack of SSL/HTTPS for custom domains (code.google.com/p/googleappengine/issues/detail?id=792) is a biggie for those of us who would like to create any kind of "enterprisey" software on App Engine. - Jonik
There's no built-in support for HTTP sessions, although a few external libraries do provide this. - quikchange
@quickchange you have session with GaeJ. - systempuntoout
(1) On 2010-12-02, Google removed the 30 second deadline: googleappengine.blogspot.com/2010/12/… - Trenton
(1) Point 1: No HTTPS for custom domain is not correct anymore: Read developers.google.com/appengine/docs/ssl - Viral Patel
5
[+7] [2009-10-01 07:58:02] Richard Watson

Performance will surprise you. GAE is optimized for many tiny queries and you get warned if a query takes any CPU time at all. You get 6.5 (at last check) free hours per day, but it's a mystical number and you should test.

You'll find that time as you measure it doesn't relate to the CPU or datastore CPU time, because (for example) under the covers there might be multiple machines updating indexes during deletes/updates. Some users have found huge CPU usage when uploading bulk data - many hours of usage for e.g. 20 min of real time.

Your Java instance might need to be powered-up if it hasn't been hit in (I think) 20 minutes. The benefit is that they can pass their smart management on to you as cheaper costs, but it does mean you'll experience a short delay, and see a high CPU warning on the first request in a while.

For many cases, Python datastore access is faster than Java JDO. You'll likely find that using the low-level API for Java is faster.

Some developers seem to have experienced more datastore errors than you would expect (around 0.4-1% maybe?). I haven't yet.


6
[+4] [2009-02-19 19:18:35] Scott Markwell

I explored Google AppEngine for my own amusement a while ago, making a lunch 8ball in the process.

http://blurry-lunch.appspot.com/

The system is very easy to work with, and to hit the ground running.

Limitations I saw in Bigtable mostly revolved around dataset size and access time. Part of the application I was making would randomly choose a location out of a list of locations. To do this, I pull out the complete list of locations and then select a random element in Python. The default index is a non-linear GUID, and I didn't bother to set up a separate attribute in the object, since finding the next available ID in the system isn't how the datastore is designed to work.
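
One common workaround for random selection without loading every record (not what this answer did) is to give each record a random key when it is created, then fetch the first record whose key is at or above a fresh random draw. The sketch below only simulates that with a sorted list; the location names and `pick_random_location` are hypothetical, and the `bisect` lookup stands in for an indexed query.

```python
import bisect
import random

# Hypothetical records: (random_key, name). On the datastore, random_key would
# be an indexed property assigned once when the entity is created.
locations = sorted(
    (random.random(), name)
    for name in ["Deli", "Noodle bar", "Taqueria", "Salad place"]
)
keys = [key for key, _ in locations]


def pick_random_location():
    # Equivalent of "first entity with random_key >= draw", wrapping around to
    # the start when the draw is larger than every stored key.
    draw = random.random()
    index = bisect.bisect_left(keys, draw) % len(locations)
    return locations[index][1]


print(pick_random_location())
```

The selection is not perfectly uniform, because records that follow a large gap in the key space are picked more often, but it touches one record instead of the whole list.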

The problem is, if you need linear access to a massive amount of records to perform an operation, you could run into a limit where your request takes too long, but what data set would cause such a delay isn't clearly defined, as google's systems are massively clustered.

As far as external Python libraries go, you should be fine, as long as the calls they make are Python based. You will have to bundle them yourself into the directory structure that gets uploaded for deployment.

You should be aware that you will be locking yourself into their platform, as there is no production-ready system that supports their API. They have a standalone appserver for debug purposes, but it certainly isn't suitable for an actual deployment.

Another thing is that GAE is still in beta, with no commercial support options, and you could not run a wildly successful web application currently without a commercial plan. The limitations are too low to even survive a slashdotting for static content.

EDIT: Of course this is all as of Feb 19th, 2009. This could all change wildly, or GAE could even be turned off.


Even though there is no commercial support as of now, it is not true that you can't have a wildly successful web application. googleappengine.blogspot.com/2008/10/… - fuentesjr
Hmmm, that particular incident I saw through slashdot may have been a one off, or a badly written site. They were pushing a bit of static data, so may have gone beyond what the daily bandwidth allowance. - Scott Markwell
I'm sure they definitely have gone beyond the daily allowance. giftag.com also i'm sure has gone over daily limits. The catch here is that if your app is "catching on" and you send the app engine team requests for more quota for your app, they usually grant it. - fuentesjr
Hey fuentesjr: How your giftag.com is forwarded to www.giftag.com if it hits through naked url? - Tahir Akram
7
[+2] [2009-02-20 08:09:21] zgoda

If you are considering the possibility of moving to a different hosting platform, try to isolate things that are unique to GAE (mostly storage and authentication) from your app core logic; this way you'd be able to swap them with a relatively small amount of work. And do not use their WebApp framework (it's really basic anyway), stick with widely adopted tools like Django [1] or some WSGI toolbox (e.g. Werkzeug [2]). Speaking of the limits: it's always desirable to make your app responsive, so careful coding to not hit timeout limits is a Good Thing™. :)
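
A minimal sketch of that isolation, assuming storage is the piece you want to be able to swap. `MemoryStore` and `GuestBook` are hypothetical names; a GAE-backed or SQL-backed class would only need to offer the same `save` and `load` methods.

```python
class MemoryStore:
    """Stand-in backend; a GAE or SQL version would expose the same methods."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class GuestBook:
    """Core logic: knows nothing about which storage backend it was given."""

    def __init__(self, store):
        self.store = store

    def sign(self, name, message):
        # Read, modify, and write back through the backend-neutral interface.
        entries = self.store.load("entries") or []
        entries.append((name, message))
        self.store.save("entries", entries)
        return len(entries)


book = GuestBook(MemoryStore())
book.sign("kyle", "hello")
print(book.sign("zgoda", "hi"))
```

Only the backend class would import platform-specific modules, so a move to another host means rewriting that one class rather than the app.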

Other than that, I found developing for GAE really fun and entertaining, perhaps due to exotic nature of platform. ;)

[1] http://www.djangoproject.com/
[2] http://werkzeug.pocoo.org/

8
[+2] [2009-02-21 11:45:42] myroslav

There are many limitations that are not marketed by Google, but you'll hit them sooner or later ;). One of them was just posted at http://stackoverflow.com/questions/572780/cpython-internal-structures.

Many limitations can be overcome by changes in algorithms, which will do your application good anyway. For instance, they recently raised the timeout from 10 sec to 30 sec for total request processing (you can spend that time in many ways that don't consume other resources; the simplest is querying an external system). Changes you make to your application to fit into 30 seconds will make your app better!

A similar approach applies to many of the other limitations. Try it, push it to the limits, and see if it fits you. Good luck!
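
One way to fit long work into the 30-second request limit is to process items against a time budget and hand back whatever is left for a later request. This is a sketch under assumptions: `process_in_budget` and its parameters are hypothetical, the 25-second default is just a safety margin picked for illustration, and it is written for current Python 3 (`time.monotonic`) rather than the Python 2.5 runtime of the time.

```python
import time


def process_in_budget(items, handle, budget_seconds=25.0, clock=time.monotonic):
    """Handle items until the time budget runs out; return the unprocessed rest."""
    deadline = clock() + budget_seconds
    for index, item in enumerate(items):
        if clock() >= deadline:
            # Stop early and hand the remainder back so a later request
            # (or a queued task) can pick up where this one stopped.
            return items[index:]
        handle(item)
    return []


# Demo with a fake clock that advances one "second" per call.
ticks = iter(range(100))
done = []
remaining = process_in_budget(
    list(range(10)), done.append, budget_seconds=4, clock=lambda: next(ticks)
)
print(done, remaining)
```

Checking the clock before each item keeps the check cheap, but a single slow item can still overrun the budget, so items should be small.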


9
[+2] [2010-09-13 20:21:11] xamde

In addition to the other answers, there is a maximum of 5000 indexed properties per entity (source: http://www.youtube.com/watch?v=AgaL6NGpkB8&feature=player_embedded# [1]!) but it really seems to be a limit of 5000 indexed property values per entity.

I summed up more limits of the datastore here http://code.google.com/p/xydra/wiki/AppEngine

[1] http://www.youtube.com/watch?v=AgaL6NGpkB8&feature=player_embedded#

10
[+2] [2010-11-19 04:37:28] CyberFonic

You are not limited to using Google's hosting of AppEngine. For example with AppScale [1] you can migrate to Amazon EC2 or your own colo servers.

When you think about it, the restrictions relative to standard Python are all to do with avoiding security holes. If your application really does require unavailable features you could look at Amazon's new micro-instances, now free for the first year for the first one. With any good VPS you are in total control. For example using NodeJS - which seems to be a very credible alternative.

[1] http://code.google.com/p/appscale/

Do you know if with appscale on amazon ec2 the app engine restrictions such as "You have a 10 seconds max Deadline on download with UrlFetch" are removed? - Fgblanch
(1) Sorry I can't answer your question. For my projects I've moved to using NodeJS on EC2 - far fewer limitations. Pity because I do like Python, but surprisingly find CoffeeScript awkward due to the need to still grok the translated JS code. - CyberFonic
11
[+2] [2010-11-19 06:41:02] hasanatkazmi

Database: there are many issues with that; notably, you can't have joins in queries.

Portability: code written for GAE can't be ported anywhere else (unless you use Django or web2py, but they also have limitations on GAE).


12
[+1] [2010-11-19 09:39:45] Phil

I'd like to know why you want to use GAE rather than a proper database backed site.

I built a couple of test projects on GAE, and both times things went smoothly initially then suddenly hit a stupid roadblock. Missing features is one problem, reliability is another.

Backend writes fail often enough to cause trouble, and these failures are well documented.

I scrapped my last GAE project when my site suddenly started serving 404 pages overnight. I hadn't changed anything and there are no logs for debugging. If it was a live production site, what would you do in those circumstances? You're basically stuffed.


There are logs for debugging in the admin console. - Mark Ellul
(1) Sorry, I meant 'there was nothing useful in the logs.' In a real lamp stack you can just go in and find out what the problem is, whether that involves looking at the logs, turning up the log level or poking around it with unix tools. This guy has some more troubles: carlosble.com/?p=719 - Phil
13
[0] [2010-11-19 13:11:25] Mark Ellul

One of the limitations that concerned me was the limits with respect to updating entities. I cannot remember where I read it and I will update the post if I find it.

The restriction is that you can only write to the same entity 5 times per second.

If someone has heard of this restriction and has a link please add as a comment.

Another limitation is support: there is an error at the moment which causes 500 errors randomly, and some apps need to upload their code again to get the app working (see the django-nonrelation Google group to see what I am talking about). There is no word on when this will be fixed.

Also, I once tried to run an app that integrated with Salesforce, and because Salesforce took more than 10 seconds to service the request, it failed (that limit has since been upped to 30s).


code.google.com/appengine/articles/sharding_counters.html, from 1st paragraph: "you can only expect to update any single entity or entity group about five times a second" - Irit Malka
14