2012-09-25
(cloud, production, juju)
Written by Mark Mims and Chris Johnston
Hey, so last month we ran scheduling for the Linux Plumbers Conference entirely on juju!
Here’s a little background on the experience.
Along the way, we’ll go into a little more detail about running juju in production than the particular problem at hand might warrant. It’s a basic stack of services that’s only alive for 6-months or so… but this discussion applies to bigger longer-running production infrastructures too, so it’s worth going over here.
The App
So summit is this great django app built for scheduling conferences. It’s evolved over time to handle UDS-level traffic and is currently maintained by a Summit Hackers team that includes Chris Johnston and Michael Hall.
Chris contacted me to help him use juju to manage summit for this year’s Plumbers conference. At the time we started this, the 11.10 version of juju wasn’t exactly blessed for production environments, but we decided it’d be a great opportunity to work things out.
The Stack
A typical summit stack’s got postgresql, the django app itself, and a memcached server.
We additionally talked about putting this all behind some sort of a head like haproxy.
This’d let the app scale horizontally as well as give us a stable point to attach an elastic-ip. We decided to not do this at the time b/c we could most likely handle the peak conference load with a single django service-unit provided we slam select snippets of the site into memcached.
This turned out to be true load-wise, but it really would’ve been a whole lot easier to have a nice constant haproxy node out there to tack the elastic-ip to. During development (charm, app, and theme) you want the freedom to destroy a service and respawn it without having to use external tools to go around and attach public IP addresses to the right places. That’s a pain. Also, if there’s a sensitive part of this infrastructure in production, it wouldn’t be postgresql, memcached, or haproxy… the app itself would be the most likely point of instability, so it was a mistake to attach the elastic-ip there.
The Environment
choice of cloud
We chose to use ec2 to host the summit stack… mostly a matter of convenience. The juju openstack-native provider wasn’t completed when we spun up the production environment for linuxplumbers and we didn’t have access to a stable private ubuntu cloud running the openstack-ec2-api at the time. All of this has subsequently landed, so we’d have more options today.
the charms
We forked Michael Nelson’s excellent django charm to create a summit-charm and freely specialized it for summit.
Note that we’re updating this charm for 12.04 here, but this will probably go away in the near future and we’ll just use a generic django charm. It turns out we didn’t do too much here that won’t apply to django apps in general, but more on that another time.
There was nothing special about our tuning of postgresql or memcached. We just used the services provided by the canned charms. These sort of peripheral services aren’t the kind of charms you’re likely to be making changes to or tweaking outside of their exposed config parameters. I know jack about memcached, so I’ll defer to the experts in this regard. Similarly for postgresql… and haproxy if we used it in this stack.
The summit charm is a little different. It’s something we were continuing to tweak during development. Perhaps with future more generic django charm versions, we won’t need to tweak the charm itself… just configure it.
We used a “local” repository for all charms because the charm store hadn’t landed when we were setting this up. Well, now that the charm store is live, you can just deploy the canned charms straight from the store
`juju deploy -e summit memcached`
and keep the ones you want to tweak in a local repository…
`juju deploy -e summit --repository ~/charms local:summit`
all within the same environment. It works out nicely.
control environment
We had multiple people to manage the production summit environment. What’s the best way to do that? It turns out juju supports this pretty well right out of the box. There’s an environment config for the set of ssh public keys to inject into everything in the environment as it starts up… you can read more about that on askubuntu.
Note that this is only useful to configure at the beginning of the stack. Once you’re up, adding keys is problematic. I don’t even recommend trying b/c of the risk of getting undetermined state for the environment. i.e., different nodes with different sets of keys depending on when you changed the keys relative to what actions you’ve performed on the environment. It’s a problem.
What I recommend now is actually to use another juju environment… (and no, we’re not paid to promote cloud providers by the instance :) I wish! ) a dedicated “control” environment. You bootstrap it, then set up a juju client that controls the main production environment. Then set up a shared tmux session that any of the admins for the production environment can use:
Adding/changing the set of admin keys is then done in a single place. This technique isn’t strictly necessary, but it was certainly worth it here with different admins having various different levels of familiarity with the tools. I started it as a teaching tool, left it up because it was an easy control dashboard, and now recommend it because it works so well.
it’s chilly in here
Yeah, so during development you break things. There were a couple of times using 11.10 juju that changes to juju core prevented a client from talking to an existing stack. Aargh! This wasn’t going to fly for production use.
The juju team has subsequently done a bunch to prevent this from happening, but hey we needed production summit working and stable at the time. The answer… freeze the code.
Juju has an environment config option juju-origin to specify where to get the juju installed on all instances in the environment. I branched juju core to lp:~mark-mims/juju/running-summit and just worked straight from there for the lifetime of the environment (still up atm). Easy enough.
Now the tricky part is to make sure that you’re always using the lp:~mark-mims/juju/running-summit version of the juju cli when talking to the production summit environment.
I set up
#!/bin/bash
export JUJU_BRANCH=$HOME/src/juju/running-summit
export PATH=$JUJU_BRANCH/bin:$PATH
export PYTHONPATH=$JUJU_BRANCH
which my tmuxinator config sources into every pane in my summit tmux session.
This was also done on the summit-control instance so it’s easy to make sure we’re all using the right version of the juju cli to talk to the production environment.
backups
The juju ssh subcommand to the rescue. You can do all your standard ssh tricks…
juju ssh postgresql/0 'su postgres pg_dump summit' > summit.dump
… on a cronjob. Juju just stays out of the way and just helps out a bit with the addressing. Real version pipes through bzip2 and adds timestamps of course.
Of course snapshots are easy enough too via euca2ools, but the pgsql dumps themselves turned out to be more useful and easy to get to in case of a failover.
debugging
The biggest debugging activity during development was cleaning up the app’s theming. The summit charm is configured to get the django app itself from one application branch and the theme from a separate theme branch.
So… ahem… “best practice” for theme development would’ve been to develop/tweak the theme locally, then push to the branch. A simple
juju set --config=summit.yaml summit/0
would update config for the live instances.
Well… some of the menus from the base template used absolute paths so it was simpler to cheat a bit early in the process to test it all in-place with actual dns names. Had we been doing this the “right” way from the beginning we would’ve had much more confidence in the stack when practicing recovery and failover later in the cycle… we would’ve been doing it all since day one.
Another thing we had to do was manually test memcached. To test out caching we’d ssh to the memcached instance, stop the service, run memcached verbosely in the foreground. Once we determined everything was working the way we expected, we’d kill it and restart the upstart job.
This is a bug in the memcached charm imo… the option to temporarily run verbosely for debugging should totally be a config option for that service. It’d then be a simple matter of
juju set memcached/0 debug=true
and then
juju ssh memcached/0
to watch some logs. Once we’re convinced it’s working the way it should
juju set memcached/0 debug=false
should make it performant again.
Next time around, we should take more advantage of juju set config to update/reconfigure the app as we made changes… and generally implement a better set of development practices.
monitoring
Sorely lacking. “What? curl doesn’t cut it?”… um… no.
planning for failures
Our notion of failover for this app was just a spare set of cloud credentials and a tested recovery plan.
The plan we practiced was…
-
bootstrap a new environment (using spare credentials if necessary)
-
spin up the summit stack
-
ssh to the new postgresql/0 and drop the db (Note: the postgresql charm should be extended to accept a config parameter of a storage url, S3 in this case, to slurp the db backups from)
-
restore from offsite backups… something along the lines of
cat summit-$timestamp.dump.bz2 | juju ssh -e failover postgresql/0 ‘bunzip2 -c | su - postgres pgsql summit’
In practice, that took about 10-15minutes to recover once we started acting. Given the additional delay between notification and action, that could spell an hour or two of outtage. That’s not so great.
Juju makes other failover scenarios cheaper and easier to implement than they used to be, so why not put those into place just to be safe? Perhaps the additional instance costs for hot-spares wouldn’t’ve been necessary for the entire 6-months of lead-time for scheduling and planning this conference, but they’d certainly be worth the spend during the few days of the event itself. Juju sort of makes it a no-brainer. We should do more posts on this one issue… the game has changed here.
Lessons Learned
What would we do differently next time? Well, there’s a list :).
- use the stable ppa… instead of freezing the code
- sit the app behind haproxy
- use s3fs or equivalent subordinate charm to manage backups instead of just sshing them off the box
- better monitoring… we’ve gotten a great set of monitoring charms recently… thanks Clint!
- log aggregation would’ve been a little bit of overkill for this app, but next time might warrant it
- it’s cheap to add failover with juju… just do it
- maybe follow a development process a little more carefully next time around :)
- we’ll soon have access to a production-stable private ubuntu cloud for these sorts of apps/projects
2012-06-04
(cloud, hadoop, juju)
Written by Mark Mims and James Page
Lately we’ve been fleshing out our testing frameworks for Juju and Juju Charms. There’s lots of great stuff going on here, so we figured it’s time to start posting about it.
First off, the coolest thing we did during last month’s Ubuntu Developer Summit (UDS) was get the go-ahead to spend more time/effort/money scale-testing Juju.
The Plan
-
pick a service that scales
-
spin up a cluster of units for this service
-
try to run it in a way that actively engages all units of the cluster
-
repeat:
- instrument
- profile
- optimize
- grow
James, Kapil, Juan, Ben, and Mark sat down over the course of a couple of nights at UDS to take a crack at it. We chose Hadoop. We started with 40 nodes and iterated up 100, 500, 1000 and 2000. Here’re some notes on the process.
Hadoop
Hadoop was a pretty obvious choice here. It’s a great actively-maintained project with a large community of users. It scales in a somewhat known manner, and the hadoop charm makes it super-simple to manage. There are also several known benchmarks that are pretty straightforward to get going, and distribute load throughout the cluster.
There’s an entire science/art to tuning hadoop jobs to run optimally given the characteristics of a particular cluster. Our sole goal in tuning hadoop benchmarks was to engage the entire cluster and profile juju during various activities throughout an actual run. For our purposes, we’re in no hurry… a slower/longer run gives us a good profiling picture for managing the nodes themselves under load (with a sufficient mix of i/o -vs- cpu load).
EC2
Surprisingly enough, we don’t really have that many servers just lying around… so EC2 to the rescue.
Disclaimer… we’re testing our infrastructure tools here, not benchmarking hadoop in EC2. Some folks advocate running hadoop in a cloudy virtualized environment… while some folks are die-hard server huggers. That’s actually a really interesting discussion. It comes down to the actual jobs/problems you’re trying to solve and how those jobs fit in your data pipeline. Please note that we’re not trying to solve that problem here or even provide realistic benchmarking data to contribute to the discussion… we’re simply testing how our infrastructure tools perform at scale.
If you do run hadoop in EC2, Amazon’s Elastic Map Reduce service is likely to perform better at scale in EC2 than just running hadoop itself on general purpose instances. Amazon can do all sorts of stuff internally to show hadoop lots of love. We chose not to use EMR because we’re interested in testing how juju performs with generic Ubuntu Server images, not EMR… at least for now.
Note that stock EC2 accounts limit you to something like 20 instances. To grow beyond that, you have to ask AWS to bump up your limits.
Juju
We started scale testing from a fresh branch of juju trunk… what gets deployed to the PPA nightly… this freed us up to experiment with live changes to add instrumentation, profiling information, and randomly mess with code as necessary. This also locks in the branch of juju that the scale testing environment uses.
As usual, juju will keep track of the state of our infrastructure going forward and we can make changes as necessary via juju commands. To bootstrap and spin up the initial environment we’ll just use shell scripts wrapping juju commands.
Spinning up a cluster
These scripts are really just hadoop versions of some standard juju demo scripts such as those used for a simple rails stack or a more realistic HA wiki stack.
The hadoop scripts for EC2 will get a little more complex as we grow simply because we don’t want AWS to think we’re a DoS attack… we’ll pace ourselves during spinup.
From the hadoop charm’s readme, the basic steps to spinning up a simple combined hdfs and mapreduce cluster are:
juju bootstrap
juju deploy hadoop hadoop-master
juju deploy -n3 hadoop hadoop-slavecluster
juju add-relation hadoop-master:namenode hadoop-slavecluster:datanode
juju add-relation hadoop-master:jobtracker hadoop-slavecluster:tasktracker
which we expand on a bit to start with a base startup script that looks like:
#!/bin/bash
juju_root="/home/ubuntu/scale"
juju_env=${1:-"-escale"}
###
echo "deploying stack"
juju bootstrap $juju_env
deploy_cluster() {
local cluster_name=$1
juju deploy $juju_env --repository "$juju_root/charms" --constraints="instance-type=m1.large" --config "$juju_root/etc/hadoop-master.yaml" local:hadoop ${cluster_name}-master
juju deploy $juju_env --repository "$juju_root/charms" --constraints="instance-type=m1.medium" --config "$juju_root/etc/hadoop-slave.yaml" -n 37 local:hadoop ${cluster_name}-slave
juju add-relation $juju_env ${cluster_name}-master:namenode ${cluster_name}-slave:datanode
juju add-relation $juju_env ${cluster_name}-master:jobtracker ${cluster_name}-slave:tasktracker
juju expose $juju_env ${cluster_name}-master
}
deploy_cluster hadoop
echo "done"
and then manually adjust this for cluster size.
Configuring Hadoop
Note that we’re specifying constraints to tell juju to use different sized ec2 instances for different juju services. We’d like an m1.large for the hadoop master
juju deploy ... --constraints "instance-type=m1.large" ... hadoop-master
and m1.mediums for the slaves
juju deploy ... --constraints "instance-type=m1.medium" ... hadoop-slave
Note that we’ll also pass config files to specify different heap sizes for the different memory footprints
juju deploy ... --config "hadoop-master.yaml" ... hadoop-master
where hadoop-master.yaml looks like
# m1.large
hadoop-master:
heap: 2048
dfs.block.size: 134217728
dfs.namenode.handler.count: 20
mapred.reduce.parallel.copies: 50
mapred.child.java.opts: -Xmx512m
mapred.job.tracker.handler.count: 60
# fs.inmemory.size.mb: 200
io.sort.factor: 100
io.sort.mb: 200
io.file.buffer.size: 131072
tasktracker.http.threads: 50
hadoop.dir.base: /mnt/hadoop
and
juju deploy ... --config "hadoop-slave.yaml" ... hadoop-slave
where hadoop-slave.yaml looks like
# m1.medium
hadoop-slave:
heap: 1024
dfs.block.size: 134217728
dfs.namenode.handler.count: 20
mapred.reduce.parallel.copies: 50
mapred.child.java.opts: -Xmx512m
mapred.job.tracker.handler.count: 60
# fs.inmemory.size.mb: 200
io.sort.factor: 100
io.sort.mb: 200
io.file.buffer.size: 131072
tasktracker.http.threads: 50
hadoop.dir.base: /mnt/hadoop
Note also that we also have our juju environment configured to use instance-store images… juju defaults to ebs-rooted images, but that’s not a great idea with hdfs. You specify this by adding a default-image-id into your ~/.juju/environments.yaml file. This gave each of our instances an extra ~400G local drive on /mnt… hence the hadoop.dir.base of /mnt/hadoop in the config above.
40 nodes and 100 nodes
Both the 40-node and 100-node runs went as smooth as silk. The only thing to note was that it took a while to get AWS to increase our account limits to allow for 100+ nodes.
500 nodes
Once we had permission from Amazon to spin up 500 nodes on our account, we initially just naively spun up 500 instances… and quickly got throttled.
No particular surprise, we’re not specifying multiplicity in the ec2 api, nor are we using an auto scaling group… we must look like a DoS attack.
The order was eventually fulfilled, and juju waited around for it. Everything ran as expected, it just took about an hour and 15 minutes to spin up the stack. This gave us a nice little cluster with HDFS storage of almost 200TB
The hadoop terasort job was run from the following script
#!/bin/bash
SIZE=10000000000
NUM_MAPS=1500
NUM_REDUCES=1500
IN_DIR=in_dir
OUT_DIR=out_dir
hadoop jar /usr/lib/hadoop/hadoop-examples*.jar teragen -Dmapred.map.tasks=${NUM_MAPS} ${SIZE} ${IN_DIR}
sleep 10
hadoop jar /usr/lib/hadoop/hadoop-examples*.jar terasort -Dmapred.reduce.tasks=${NUM_REDUCES} ${IN_DIR} ${OUT_DIR}
which, with a replfactor of 3, engaged the entire cluster just fine, and ran terasort with no problems
Juju itself seemed to work great in this run, but this brought up a couple of basic optimizations against the EC2 api:
- pass the '-n' options directly to the provisioning agent... don't expand `juju deploy -n <num_units>` and `juju add-unit -n <num_units>` in the client
- pass these along all the way to the ec2 api... don't expand these into multiple api calls
We’ll add those to the list of things to do.
1000 nodes
Onward, upward!
To get around the api throttling, we start up batches of 99 slaves at a time with a 2-minute wait between each batch
#!/bin/bash
juju_env=${1:-"-escale"}
juju_root="/home/ubuntu/scale"
juju_repo="$juju_root/charms"
############################################
timestamp() {
date +"%G-%m-%d-%H%M%S"
}
add_more_units() {
local num_units=$1
local service_name=$2
echo "sleeping"
sleep 120
echo "adding another $num_units units at $(timestamp)"
juju add-unit $juju_env -n $num_units $service_name
}
deploy_slaves() {
local cluster_name=$1
local slave_config="$juju_root/etc/hadoop-slave.yaml"
local slave_size="instance-type=m1.medium"
local slaves_at_a_time=99
#local num_slave_batches=10
juju deploy $juju_env --repository $juju_repo --constraints $slave_size --config $slave_config -n $slaves_at_a_time local:hadoop ${cluster_name}-slave
echo "deployed $slaves_at_a_time slaves"
juju add-relation $juju_env ${cluster_name}-master:namenode ${cluster_name}-slave:datanode
juju add-relation $juju_env ${cluster_name}-master:jobtracker ${cluster_name}-slave:tasktracker
for i in {1..9}; do
add_more_units $slaves_at_a_time ${cluster_name}-slave
echo "deployed $slaves_at_a_time slaves at $(timestamp)"
done
}
deploy_cluster() {
local cluster_name=$1
local master_config="$juju_root/etc/hadoop-master.yaml"
local master_size="instance-type=m1.large"
juju deploy $juju_env --repository $juju_repo --constraints $master_size --config $master_config local:hadoop ${cluster_name}-master
deploy_slaves ${cluster_name}
juju expose $juju_env ${cluster_name}-master
}
main() {
echo "deploying stack at $(timestamp)"
juju bootstrap $juju_env --constraints="instance-type=m1.xlarge"
sleep 120
deploy_cluster hadoop
echo "done at $(timestamp)"
}
main $*
exit 0
We experimented with more clever ways of doing the spinup (too little coffee at this point of the night)… but the real fix is to get juju to take advantage of multiplicity in api calls. Until then, timed batches work just fine.
Juju spun the cluster up in about 2 and a half hours. It had about 380TB of HDFS storage
The terasort job that was run from the script above with
SIZE=10000000000
NUM_MAPS=3000
NUM_REDUCES=3000
eventually completed.
2000 nodes
After the 1000-node run, we chose to clean up from the previous job and just add more nodes to that same cluster.
Again, to get around the api throttling, we added batches of 99 slaves at a time with a 2-minute wait between each batch until we got near 2000 slaves.
This gave us almost 760TB of HDFS storage
and was running fine
but was stopped early b/c waiting for the job to complete would’ve just been silly at this point. With our naive job config, we’re considerably past the point of diminishing returns for adding nodes to the actual terasort, and we’d captured the profiling info we needed at this point.
Juju spun up 1972 slaves in just over seven hours total. Profiling showed that juju was spending a lot of time serializing stuff into zookeeper nodes using yaml. It looks like python’s yaml implementation is python, and not just wrapping libyaml. We tested a smaller run replacing the internal yaml serialization with json.. Wham! two orders of magnitude faster. No particular surprise.
Lessons Learned
Ok, so at the end of the day, what did we learn here?
What we did here is the way developing for performance at scale should be done… start with a naive, flexible approach and then spend time and effort obtaining real profiling information. Follow that with optimization decisions that actually make a difference. Otherwise it’s all just a crapshoot based on where developers think the bottlenecks might be.
Things to do to juju as a result of these tests:
-
streamline our implementation of ‘-n’ options
- the client should pass the multiplicity to the provisioning agent
- the provisioning agent should pass the multiplicity to the EC2 api
-
don’t use yaml to marshall data in and out of zookeeper
-
replace per-instance security groups with per-instance firewalls
What’s Next?
So that’s a big enough bite for one round of scale testing.
Next up:
- land a few of the changes outlined above into trunk. Then, spin up another round of scale tests to look at the numbers.
- more providers (other clouds as well as a MaaS lab too)
- regular scale testing? - can this coincide with upstream scale testing for projects like hadoop?
- test scaling for various services? What does this look like for other stacks of services?
Wishlist
-
find some better test jobs! benchmarks are boring… perhaps we can use this compute time to mine educational data or cure cancer or something?
-
perhaps push juju topology information further into zk leaf nodes? Are there transactional features in more recent versions of zk that we can use?
-
use spot instances on ec2. This is harder because you’ve gotta incorporate price monitoring.
2011-11-22
(cloud, charmschool, juju)
Wanna learn more about juju?
Drop by Charm School:
Details from Jorge’s post:
We're holding a Charm School on IRC.
juju Charm School is a virtual event where a juju expert
is available to answer questions about writing your own
juju charms. The intended audience are people who deploy
software and want to contribute charms to the wider devops
community to make deploying in the public and private
cloud easy.
Attendees are more than welcome to:
Ask questions about juju and charms
Ask for help modifying existing scripts and make charms out of them
Ask for peer review on existing charms you might be working on
Though not required, we recommend that you have juju installed
and configured if you want to get deep into the event.
2011-11-08
(cloud, hadoop, juju)
#########################################################
NOTE: Repost
The ubuntu project “ensemble” is now publicly known as “juju”. This is a repost of an older article Monitoring Hadoop Benchmarks TeraGen/TeraSort with Ganglia to reflect the new names and updates to the api.
#########################################################
Here I’m using new features of Ubuntu Server (namely juju) to easily deploy Ganglia alongside a small Hadoop cluster to play around with monitoring some benchmarks like Terasort.
Short Story
Deploy hadoop and ganglia using juju:
$ juju bootstrap
$ juju deploy --repository "~/charms" local:hadoop-master namenode
$ juju deploy --repository "~/charms" local:ganglia jobmonitor
$ juju deploy --repository "~/charms" local:hadoop-slave datacluster
$ juju add-relation namenode datacluster
$ juju add-relation jobmonitor datacluster
$ for i in {1..6}; do
$ juju add-unit datacluster
$ done
$ juju expose jobmonitor
When all is said and done (and EC2 has caught up), run the jobs
$ juju ssh namenode/0
ubuntu$ sudo -su hdfs
hdfs$ hadoop jar hadoop-*-examples.jar teragen -Dmapred.map.tasks=100 -Dmapred.reduce.tasks=100 100000000 in_dir
hdfs$ hadoop jar hadoop-*-examples.jar terasort -Dmapred.map.tasks=100 -Dmapred.reduce.tasks=100 in_dir out_dir
While these are running, we can run
$ juju status
to get the URL for the jobmonitor ganglia web frontend
http://<jobmonitor-instance-ec2-url>/ganglia/
and see…
and a little later as the jobs run…
Of course, I’m just playing around with ganglia at the moment… For real performance, I’d change my juju config file to choose larger (and ephemeral) EC2 instances instead of the defaults.
A Few Details…
Let’s grab the charms necessary to reproduce this.
First, let’s install juju and set up a our charms.
$ sudo apt-get install juju charm-tools
Note that I’m describing all this using an Ubuntu laptop to run the juju cli because that’s how I roll, but you can certainly use a Mac to drive your Ubuntu services in the cloud. The juju CLI is already available in ports, but I’m not sure the version. Homebrew packages are in the works. Windows should work too, but I don’t have a clue.
$ mkdir -p ~/charms/oneiric
$ cd ~/charms/oneiric
$ charm get hadoop-master
$ charm get hadoop-slave
$ charm get ganglia
That’s about all that’s really necessary to get you up and benchmarking/monitoring.
I’ll do another post on how to adapt your own charms to use monitoring and the monitor juju interface as part of the “Core Infrastructure” series I’m writing for charm developers. I’ll go over the process of what I had to do to get the hadoop-slave service talking to monitoring services like ganglia.
Until then, clone/test/enjoy… or better yet, fork/adapt/use!
2011-11-08
(cloud, juju, hadoop)
#########################################################
NOTE: Repost
The ubuntu project “ensemble” is now publicly known as “juju”. This is a repost of an older article Painless Hadoop / Ubuntu / EC2 to reflect the new names and updates to the api.
#########################################################
Thanks Michael Noll for the posts where I first learned how to do this stuff:
I’d like to run his exact examples, but this time around I’ll use juju for hadoop deployment/management.
The Short Story
Setup
install/configure juju client tools
$ sudo apt-get install juju charm-tools
$ mkdir ~/charms && charm getall ~/charms
run hadoop services with juju
$ juju bootstrap
$ juju deploy --repository ~/charms local:hadoop-master namenode
$ juju deploy --repository ~/charms local:hadoop-slave datanodes
$ juju add-relation namenode datanodes
optionally add datanodes to scale horizontally
$ juju add-unit datanodes
$ juju add-unit datanodes
$ juju add-unit datanodes
(you can add/remove these later too)
Scaling is so easy there’s no point in separate standalone -vs- multinode versions of the setup.
Data and Jobs
Load your data and jars
$ juju ssh namenode/0
ubuntu$ sudo -su hdfs
hdfs$ cd /tmp
hdfs$ wget http://files.markmims.com/gutenberg.tar.bz2
hdfs$ tar xjvf gutenberg.tar.bz2
copy the data into hdfs
hdfs$ hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg
run mapreduce jobs against the dataset
hdfs$ hadoop jar /usr/lib/hadoop-0.20/hadoop-examples.jar wordcount -Dmapred.map.tasks=20 -Dmapred.reduce.tasks=20 gutenberg gutenberg-output
That’s it!
Now, again with some more details…
Installing juju
Install juju client tools onto your local machine…
# sudo apt-get install juju charm-tools
We’ve got the juju CLI in ports now too for Mac clients (Homebrew is in progress).
Now generate your environment settings with
$ juju
and then edit ~/.juju/environments.yaml to use your EC2 keys. It’ll look something like:
environments:
sample:
type: ec2
control-bucket: juju-<hash>
admin-secret: <hash>
access-key: <your ec2 access key>
secret-key: <your ec2 secret key>
default-series: oneiric
In real life you’d probably want to specify default-image-type to at least m1.large too, but I’ll give some examples of that in later posts.
Hadoop
Grab the juju charms
Make a place for charms to live
$ mkdir charms/oneiric
$ cd charms/oneiric
$ charm get hadoop-master
$ charm get hadoop-slave
(optionally, you can charm getall but it’ll take a bit to pull all charms).
Start the Hadoop Services
Spin up a juju environment
$ juju bootstrap
wait a minute or two for EC2 to comply. You’re welcome to watch the water boil with
$ juju status
or even
$ watch -n30 juju status
which’ll give you output like
$ juju status
2011-07-12 15:20:54,978 INFO Connecting to environment.
The authenticity of host 'ec2-50-17-28-19.compute-1.amazonaws.com (50.17.28.19)' can't be established.
RSA key fingerprint is c5:21:62:f0:ac:bd:9c:0f:99:59:12:ec:4d:41:48:c8.
Are you sure you want to continue connecting (yes/no)? yes
machines:
0: {dns-name: ec2-50-17-28-19.compute-1.amazonaws.com, instance-id: i-8bc034ea}
services: {}
2011-07-12 15:21:01,205 INFO 'status' command finished successfully
Next, you need to deploy the hadoop services:
$ juju deploy --repository ~/charms local:hadoop-master namenode
$ juju deploy --repository ~/charms local:hadoop-slave datanodes
now you simply relate the two services:
$ juju add-relation namenode datanodes
Relations are where the juju special sauce is, but more about that in another post.
You can tell everything’s happy when juju status gives you something like (looks a bit different, but basics are the same):
$ juju status
2011-07-12 15:29:20,331 INFO Connecting to environment.
machines:
0: {dns-name: ec2-50-17-28-19.compute-1.amazonaws.com, instance-id: i-8bc034ea}
1: {dns-name: ec2-50-17-0-68.compute-1.amazonaws.com, instance-id: i-4fcf3b2e}
2: {dns-name: ec2-75-101-249-123.compute-1.amazonaws.com, instance-id: i-35cf3b54}
services:
namenode:
formula: local:hadoop-master-1
relations: {hadoop-master: datanodes}
units:
namenode/0:
machine: 1
relations:
hadoop-master: {state: up}
state: started
datanodes:
formula: local:hadoop-slave-1
relations: {hadoop-master: namenode}
units:
datanodes/0:
machine: 2
relations:
hadoop-master: {state: up}
state: started
2011-07-12 15:29:23,685 INFO 'status' command finished successfully
Loading Data
Log into the master node
$ juju ssh namenode/0
and become the hdfs user
ubuntu$ sudo -su hdfs
pull the example data
hdfs$ cd /tmp
hdfs$ wget http://files.markmims.com/gutenberg.tar.bz2
hdfs$ tar xjvf gutenberg.tar.bz2
and copy it into hdfs
hdfs$ hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg
Running Jobs
Similar to above, but now do
hdfs$ hadoop jar /usr/lib/hadoop-0.20/hadoop-examples.jar wordcount gutenberg gutenberg-output
you might want to explicitly call out the number of jobs to use…
hdfs$ hadoop jar /usr/lib/hadoop-0.20/hadoop-examples.jar wordcount -Dmapred.map.tasks=20 -Dmapred.reduce.tasks=20 gutenberg gutenberg-output
depending on the size of the cluster you decide to spin up.
You can look at logs on the slaves by
$ juju ssh datanodes/0
ubuntu$ tail /var/log/hadoop/hadoop-hadoop-datanode*.log
ubuntu$ tail /var/log/hadoop/hadoop-hadoop-tasktracker*.log
similarly for subsequent slave nodes if you’ve spun them up
$ juju ssh datanodes/1
or
$ juju ssh datanodes/2
Horizontal Scaling
To resize your cluster,
$ juju add-unit datanodes
or even
$ for i in {1..10}
$ do
$ juju add-unit datanodes
$ done
Wait for juju status to show everything in a happy state and then run your jobs.
I was able to add slave nodes in the middle of a run… they pick up load and crank.
Check out the juju status output for a simple 10-slave cluster here