Doing it right: Moving to AWS Part 1

2011-01-17 19:00:00 -0500


Greetings folks, I'm Erik Hollensbe, recent Zetetic recruit. One of my first tasks here has been to move Tempo and this site, and our marketing sites for products like Strip and Connect to AWS, specifically EC2.

Why EC2?

I'm going to presume you've heard the standard-fare cloud rhetoric; so I'm going to just skip that part and go straight into the meat:

EC2 has a remarkable commandline-driven API:

Anything you can do in the web interface (and then some) is represented in unix philsophy — "do one thing and do it well" — java or ruby-based utilities. For example we have a script that takes nightly snapshots of our in-use EBS volumes and then prunes everything but the last 5 of these:

You can see here we are able (for the most part) to cleanly parse and manage each individual step of the process from start to finish, having an easily accessible point anywhere in the script to break it, and check whether the moving parts are functioning in the way we suspect.

For example, ec2-describe-volumes yields output like this:


VOLUME vol-5ab82932 15 snap-ad6f2234 us-east-1d in-use 2010-12-14T17:07:21+0000
ATTACHMENT vol-5ab82932 i-f0370bed /dev/sda1 attached 2010-12-14T17:07:24+0000
TAG volume vol-5ab82932 Name somebox-root

(That's one line)

End-over-end enumerating all your volumes, which can be pulled apart with classic unix dissection utilities like sort, awk, and perl. You can see with the information being supplied to the script above, how the hard work is done for us, leaving us to extract the pieces we care about, culminating in a relatively boilerplate-free experience.

There's 101 tools at a quick count of my ec2-api-tools directory — filtering duplicates for Windows and Legacy systems — and that's not even counting the ec2-ami-tools which are used to build custom systems (more on that later).

EC2 was built by network engineers for network engineers

One of the most useful features of EC2 is its networking stack. EC2 machines are behind a NAT (10/8 spanning all DCs) and assigned external IPs dynamically. One can change the external routable IPs using the "Elastic IP" system. These IP addresses can be attached and detached from systems about as liberally as you attach a thumb drive to machines, giving you maximum flexibility. No more waiting for pesky TTLs to expire or having to change 20 /etc/hosts files or scripts.

EC2 also supports "security groups", which are basically network roles. Roles can be used in firewall rules to create logical partitioning between machines in your network.

For example, you can have a set of machines in the 'webserver' class and another set of machines in the 'database' class that only accept connections on port 5432 (the standard postgresql port) from 'webservers'. This means no matter how many 'webservers' you create, or how many times they change IP addresses, your firewall rules still keep out the hax0rs and let webservers talk to your databases. These rules are port/protocol dependent and the "security group" takes the place of CIDR notation for in/out.

Get as detailed or simple as you want

AMIs, EBS and instance stores, what you do with EC2 is up to you. If you want a classic (oy, calling it classic is a bit much, but hey) VPS, you can run with one of the pre-rolled EBS rooted instances. If you want a shiny new cloud system, you can run with instance stores which, when terminated, disappear forever. You can even roll your own kernels and initrd's to customize your system with the AMI toolkit. We will dive deeper into these in upcoming articles.

"Ok Erik, my brain's getting tired. What can I expect for next time?"

This is going to be a minimum 4-part series (including this one) which will at least cover:

  1. EBS management both on and off the root device — here
  2. PostgreSQL on the cloud - management and backup tips — here
  3. Deploying Web Applications for volatile systems

Thanks for reading! I will be happy to field any comments below. -Erik


Tempo Maintenance 1/17/2011 9pm EST

2011-01-13 19:00:00 -0500


Greetings!

Tempo will be recieving system updates and a new deployment on this upcoming Monday, January 17 at 9pm EST. The entire process should last less than an hour.

If you have any questions or concerns, please get in touch with us.


System Maintenance, Dec 19th, 9PM EST

2010-12-12 19:00:00 -0500


This Sunday the 19th of December, we’ll be conducting a significant bit of infrastructure and system maintenance to support our various websites and our time-tracking service Tempo, which will accordingly affect your ability to access these sites, temporarily.

We’ve scheduled the downtime for 9pm to 12am EST in order to minimize impact on the business day. We’re also doing this mid-month to avoid disrupting our customers’ end-of-month (and end-of-year) billing and reporting.

If you have any questions or concerns, please get in touch with us.


Updating Page Sentry for APEX 4.0 Upgrade

2010-12-09 19:00:00 -0500


We’ve been using Oracle’s Application Express 3.0.x on a client site for about three years now, possibly more. Last week we did the upgrade to Application Express 4.0.2 on a development host, and for the most part it was seamless and our apps worked great. In addition, the new Application Express engine is snappier, easier to work with, and really quite fantastic.

However, we did run into some small troubles that became quite vexing. We use APEX in a single-sign-on environment (Oracle Access Manager is involved), so instead of using APEX’s built-in authentication schemes, we use a custom one we call oamPageSentry, based off of the many examples out there on doing just that (there’s even a kind of unofficial white-paper from Joel Kallman, one of Oracle’s APEX hackers, on creating a “page sentry” authentication scheme in PL/SQL for OAM over here on his blog).

For the sake of brevity, if you’d like to peak at our original page sentry function at the time of the upgrade, in it’s resplendent, mod_plsql glory, check out this gist.

Anyway, the strange behavior we noticed had to do with links used to forward users already authenticated in our SSO environment into a specification application and page in APEX, setting certain required page items in the reports using values passed in through the APEX URL request format. After the upgrade, we saw was that the user was getting a redirect that squashed the Page Item parameters by sending the user to a new session every time, without the original query string preserved.

This seems due not to anything wrong with the original page sentry code (ours or Oracle’s, which were identical in most aspects), but with some slight changes in behavior in APEX to avoid session manipulation, although it was quite tricky to narrow it down.

The primary problem seems to be that when the user visits a session-id-zero link, APEX 4.0 redirects the user to a new version of the URL with a new session ID every time, even if the user already has a session. We found this quite accidentally; we created a simple test app using built-in authentication to narrow things down, and we decided on a lark to include the display of the &APP_SESSION. variable. In other words, for a URL like this:

http://dev.example.com:7777/pls/apex/f?p=106:1:0::YES::P1_FOO:barbar

We’d expect to end up at something like:

http://dev.example.com:7777/pls/apex/f?p=106:1:3343328572473537::YES::P1_FOO:barbar

And on every subsequent request, we should keep landing at that same link with the same session ID (the value 3343328572473537 in the example above). However, that’s not what was happening, we were getting a new session ID every time. Even more curiously, the value being displayed from APP_SESSION never changed, it was always left at the value of the first session ID we were given in that browser session.

Another problem that was happening was that page items we passed over the original session-id-zero link were indeed retained in the redirect URL, but were not used to set the values in session, despite session protection being off. We think this is because APEX won’t properly manipulate the session from url parameters unless the session IDs agree, and in this case, they wouldn’t.

Now, this was all without the page sentry authentication turned on, so we finally had a culprit — this particular behavior would cause the page sentry function to behave somewhat incorrectly. In our page sentry, which attempts to emulate what APEX normally does, for users that already have a cookie, it just registers the existent session and attempts to forward you to the page you originally requested. Except the session IDs no longer match (because a new session has been generated by APEX thanks to session ID zero), so the URL the user is sent to is for a new session, but the parameters/page item values have been stored in the original session and ignored in the new one.

At this point we decided to adapt the more recent page sentry function listed in Joel Kallman’s blog post as our starting point for getting this working correctly, since his seemed more comprehensive and careful of edge-case scenarios. It looks like this:


create or replace function PASSPORT.oamPageSentry ( p_apex_user in varchar2 default 'APEX_PUBLIC_USER' )
return boolean
as
l_cgi_var_name varchar2(100) := 'REMOTE_USER';
l_authenticated_username varchar2(256) := upper(owa_util.get_cgi_env(l_cgi_var_name));
--
l_current_sid number;
begin
-- check to ensure that we are running as the correct database user
if user != upper(p_apex_user) then
return false;
end if;

if l_authenticated_username is null then
return false;
end if;


l_current_sid := apex_custom_auth.get_session_id_from_cookie;
if apex_custom_auth.is_session_valid then
apex_application.g_instance := l_current_sid;
if l_authenticated_username = apex_custom_auth.get_username then
apex_custom_auth.define_user_session(
p_user=>l_authenticated_username,
p_session_id=>l_current_sid);
return true;
else -- username mismatch. unset the session cookie and redirect back here to take other branch
apex_custom_auth.logout(
p_this_app=>v('APP_ID'),
p_next_app_page_sess=>v('APP_ID')||':'||nvl(v('APP_PAGE_ID'),0)||':'||l_current_sid);
apex_application.g_unrecoverable_error := true; -- tell apex engine to quit
return false;
end if;

else -- application session cookie not valid; we need a new apex session
apex_custom_auth.define_user_session(
p_user=>l_authenticated_username,
p_session_id=>apex_custom_auth.get_next_session_id);
apex_application.g_unrecoverable_error := true; -- tell apex engine to quit
--
if owa_util.get_cgi_env('REQUEST_METHOD') = 'GET' then
wwv_flow_custom_auth.remember_deep_link(p_url => 'f?'|| wwv_flow_utilities.url_decode2(owa_util.get_cgi_env('QUERY_STRING')));
else
wwv_flow_custom_auth.remember_deep_link(p_url=>'f?p='||
to_char(apex_application.g_flow_id)||':'||
to_char(nvl(apex_application.g_flow_step_id,0))||':'||
to_char(apex_application.g_instance));
end if;
-- -- register session in APEX sessions table,set cookie,redirect back
apex_custom_auth.post_login(
p_uname => l_authenticated_username,
p_session_id => nv('APP_SESSION'),
p_app_page => apex_application.g_flow_id||':'||nvl(apex_application.g_flow_step_id,0));
return false;
end if;
end oamPageSentry;
/

The first thing we needed to do with the above listing is put in a check for the session ID mis-match caused by linking the user to a page with session zero. If we have the “current”, or originally created session ID in the user’s cookie, we can check to see whether or not it matches what is in the QUERY_STRING, and if they don’t match, redirect the user to the very same URL but with the “current” session ID:


l_current_sid := apex_custom_auth.get_session_id_from_cookie;
l_url := wwv_flow_utilities.url_decode2(owa_util.get_cgi_env('QUERY_STRING'));

-- split on zero or more non-colon characters
l_url_sid := REGEXP_SUBSTR(l_url, '[^:]*', 1, 5);

-- does the current sid match the sid in the url?
if owa_util.get_cgi_env('REQUEST_METHOD') = 'GET' AND l_current_sid <> TO_NUMBER(l_url_sid) then
-- nope, so let's go to the correct url, with the current sid
wwv_flow.debug('oldurl: ' || l_url);
l_url := REGEXP_REPLACE(l_url, '^(p=.+?:.+?):*\d*(.*)$', '\1:' || l_current_sid || '\2');
wwv_flow.debug('newurl: ' || l_url);
owa_util.redirect_url('f?'|| l_url);
return false;
end if;

Now, if the user comes in on session 0 from a link, and they already have a session, but have been redirected to a new/different session ID, page sentry fixes it and redirects the user to the appropriate URL.

However, there are other problematic edge cases. Most Stephen took care of with the clever regex matching used above to identify the session ID if present in the QUERY_STRING, but the real bugger was post_login. In previous versions of APEX, we believe that the remember_deep_link call would cause any subsequent call to post_login to redirect to the user to the target URL. This doesnt appear to be the case in APEX 4.0.

To account for this, we updated the call to pass the target page in to the post_login call directly. We found that post_login will blindly append the session ID to the end of p_app_page parameter string when it redirects, and we can clean that up with the another cleanup-redirect at the beginning of our page sentry function:


-- the post_login call at the end of this function will blindly append the session ID to the URL, even if it is
-- a deep link. Detect this condition, strip the duplicate session identifier, and redirect.
if REGEXP_SUBSTR(l_url, '^.*:' || l_current_sid || ':.+:' || l_current_sid || '$') IS NOT NULL then
l_url := REGEXP_REPLACE(l_url, ':' || l_current_sid || '$', '');
wwv_flow.debug('oamPageSentry: identified duplicate session id on URL, stripping and redirecting to ' || l_url);
owa_util.redirect_url('f?'|| l_url);
return false;
end if;

This set of fixes has gotten us through, straightening out all the redirecting and link remembering that needs to be done when our users come in to link in our pages with params they need and session ID 0.

TL;DR

You can grab the full PL/SQL for our implementation of this page sentry function at this gist.


Codebook 1.4.1 Released in App Store

2010-12-01 19:00:00 -0500


As previously noted, we pulled Codebook 1.4.0 from the App Store due to a start-up bug, and rushed to get a fix prepared. The update has been released, Codebook 1.4.1 is now available in the App Store, and all users of Codebook should update as soon as possible. Visit the App Store now to get the update.

If you had installed Codebook 1.4.0, or any other recent update and could no longer start Codebook on your device, this update will restore your access to your data. As we mentioned before, we had done quite extensive testing on Codebook before releasing the last update, but we didn’t do it on enough disparate iOS devices to detect this bug, and will be doing far more extensive testing on all future updates.

Changes

Well, obviously, we fixed the start-up bug.

But the biggest change since Codebook 1.3 is that Codebook 1.4 provides the oft-requested support for multi-tasking: on devices that are running iOS 4 and support multi-tasking, you can now set a timer, allowing Codebook to stay temporarily unlocked while you switch applications. Makes the app a lot more convenient to use on the Subway, I’ll tell you that.

I’ve also ditched the recycled-paper background displayed on the note view/edit screen in favor of a more subtle paper texture that’s far more pleasant.

Coming Soon in Codebook 1.5

Work on Codebook 1.5 was completed recently, and we’re now testing it as mentioned above for any surprises. It includes a ton of usability improvements and a fantastic new feature: Dropbox Sync.

Codebook 1.5 Sync View

The note view/edit screen also got a lot of love, too:

Codebook 1.5 Note View

If the differences aren’t readily apparent: the top-right navigation bar button is no longer the action button kicking up an email compose view. This + button allows you to quickly add a new note — the current note is torn off the view (a la Apple’s MobileNotes) to reveal a fresh page.

The toolbar is a long-overdue addition. Now you can delete what you’re looking at without having to pop back to the list view, or you can hit the Action/Share button. The share button, by the way, no longer just throws up a compose view. Instead it presents a modal sheet providing you the option of sharing the current note as an email. This pause in the process allows us to add more options later and will help cut down on some accidental mailing of notes.

Finally, while you can’t tell from the screenshot here, the animation of presenting the keyboard and re-sizing the text view for editing is greatly improved. It’s no longer jerky, but nice and smooth, and the background doesn’t appear to get squished or stretched, it just remains static, providing a much nicer editing experience.

This is all probably a couple weeks out from appearing in the Store. Hopefully, by Christmas.

Thanks for using Codebook!