Thursday, July 23, 2009

Google AppEngine 99.9% Up-time With ORACLE?



I am testing a little application on Google AppEngine that sends Twitter updates to my cellphone. For two weeks this free cloud-computing hosting worked perfectly, but a couple of days ago it threw a strange error (see below). The system makes a cron web request every 2 minutes, so that is still an honorable uptime of roughly 99.9%! Apparently they say I consumed some quota, but I was using almost none of it. What is weirder is that I received an ORACLE error on my cellphone!

ORA-00604: error occurred at recursive SQL level 1
ORA-02067: transaction or savepoint rollback required
ORA-02067: transaction or savepoint rollback required
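As a rough sanity check on the uptime figure, the arithmetic can be sketched in Python. The two-week window and the 2-minute cron interval come from the post; the number of failed requests is not stated anywhere, so the failure counts below are assumptions, and the function names are made up for this example.

```python
# Rough uptime estimate for a cron job that fires every 2 minutes.
# The two-week window is from the post; the failure counts are assumptions.

MINUTES_PER_DAY = 24 * 60


def total_requests(days, interval_minutes):
    """Number of cron requests issued over the given number of days."""
    return days * MINUTES_PER_DAY // interval_minutes


def uptime_percent(total, failed):
    """Percentage of requests that succeeded."""
    return 100.0 * (total - failed) / total


if __name__ == "__main__":
    total = total_requests(days=14, interval_minutes=2)
    for failed in (1, 10):  # hypothetical failure counts
        print("%d failed of %d: %.3f%% uptime"
              % (failed, total, uptime_percent(total, failed)))
```

Two weeks at one request every 2 minutes is 10080 requests. A single failed request would give about 99.99%, and roughly ten failures in that window would give about 99.9%.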


Below is the detailed AppEngine error I got from the webapp cloud log. You can see that the error is raised by the AppEngine DataStore; maybe it is a limitation that is not observable from the application dashboard and its quotas. On the other hand, the ORACLE error codes suggest that the problem is a concurrency bug in the DB.

The lesson we learned from Cloud Computing is that you can't debug or report this kind of error because you don't know who is responsible (in this case Google, Twitter, Claro the phone provider, or me?), and you can't reproduce it either.


07-20 01:08PM 19.632 /broadcast/realtime 500 4913ms 8181cpu_ms 8013api_cpu_ms 0kb

0.1.0.1 - - [20/Jul/2009:13:08:24 -0700] "GET /broadcast/realtime HTTP/1.1" 500 84 - - "twittus.appspot.com"




  • E 07-20 01:08PM 24.533

    Traceback (most recent call last):
      File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 501, in __call__
        handler.get(*groups)
      File "/base/data/home/apps/twittus/1.335015795539495654/broadcast.py", line 59, in get
        if get_status().status == 0:
      File "/base/data/home/apps/twittus/1.335015795539495654/broadcast.py", line 20, in get_status
        for s in TwittusStatus().all().fetch(1):
      File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1426, in fetch
        raw = self._get_query().Get(limit, offset)
      File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 959, in Get
        return self._Run(limit, offset)._Get(limit)
      File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 903, in _Run
        _ToDatastoreError(err)
      File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 2055, in _ToDatastoreError
        raise errors[err.application_error](err.error_detail)
    Timeout
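Since the traceback ends in a datastore Timeout, one thing I could try is retrying the query a few times before giving up. The following is only a minimal sketch of that idea, not something the app currently does: the `Timeout` class is a hypothetical stand-in for the exception in the traceback, and `with_retries` and `flaky_fetch` are names invented for this example. It would not explain the ORACLE message, it would only soften transient timeouts.

```python
import time


class Timeout(Exception):
    """Hypothetical stand-in for the datastore Timeout in the traceback."""


def with_retries(operation, attempts=3, delay_seconds=0.5, sleep=time.sleep):
    """Call operation(); retry on Timeout, doubling the delay each time.

    Re-raises the last Timeout if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Timeout:
            if attempt == attempts - 1:
                raise
            sleep(delay_seconds * (2 ** attempt))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_fetch():
        # Simulates a query that times out twice, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise Timeout()
        return ["status"]

    # Skip the real sleep so the demo finishes immediately.
    print(with_retries(flaky_fetch, sleep=lambda seconds: None))
```

In the real handler the wrapped operation would be the query from `get_status`, and the total retry time would have to stay well below the request deadline, since the failing request already took almost 5 seconds.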



  • 2 comments:

    1. I'm having the same problem in Java:
      java.sql.SQLException: ORA-00604: error occurred at recursive SQL level 1
      ORA-02067: transaction or savepoint rollback required
      ORA-02067: transaction or savepoint rollback required


      In the test environment it never appeared (low traffic).
      In the production environment the system works correctly, but after a few thousand connections (15 to 45 minutes) the problem showed up, and once it started it kept recurring until I restarted the app or environment.

      Regards

    2. I am glad to know I am not mad, ORACLE under the app engine... :D
