Notes from a lunch-and-learn talk. It’s a little weak without the narrative, but I’ll post it here for reference anyway.
Agenda:
- config
- tunnels
    - forward
    - reverse
- proxies
- ssh + tmux = <3
    - ssh, then tmux
    - tmux, then ssh
Why SSH?
It’s common practice to secure a cluster of servers using a bastion host. Whether it’s a cluster of servers in a colocation facility, containers on a single host, or instances in an EC2 region… the same pattern applies.
The way this works is that the servers in the cluster are all locked down, inaccessible from the outside world except where the production network design of the pipeline or application requires it.
That’s all great for production network traffic. However, there’s often a need for ad hoc access: testing, debugging, and monitoring of the cluster. This usually means access to information beyond what the existing monitoring and logging for the production pipeline provide. Until automated management solutions involving immutable infrastructure components are widely adopted, you’ll almost always need the ability for an engineer to log directly into cluster instances to do things like clear /tmp directories, run jobs, etc.
You’ve also gotta routinely access various web consoles (Cloudera Manager, Spark, HDFS, etc.) to debug functional or performance problems, to change config, or even just to do sanity checks on overall cluster health.
How do you access all of this? You can’t just expose these consoles to the outside world… none of them were ever designed for that. They’re rife with holes, often with huge ramifications for any incursion! On the other hand, it’s often quite difficult (and dangerous!) to add ad hoc network access into production network planning.
Two practices are common:
- VPN access
- SSH proxies and tunnels
They each have pros and cons: tradeoffs between security, ease of use, flexibility, and capability. VPN access is often ineffective due to its static nature and its sensitivity to all manner of bad security practices. It’s particularly pointless given the random way different web consoles choose which interfaces to bind to. That’s a whole other discussion… for this talk, suffice it to say that I highly recommend and infinitely prefer an SSH-based solution. It’s worth traversing the learning curve of SSH for the sheer power and flexibility it gives you without compromising security.
Config Files
In your home directory, there’s an optional ~/.ssh/config file where you can customize your local SSH client behavior.
You can use this for simple aliases…
#################
# MyBastions
#################

Host customerXbastion
    Hostname ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com

Host customerYbastion
    Hostname ec2-yyy-yyy-yyy-yyy.compute-1.amazonaws.com

Host customerZbastion
    Hostname ec2-zzz-zzz-zzz-zzz.compute-1.amazonaws.com
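With those aliases, the short name is all you type; the client fills in the unwieldy EC2 hostname for you:

`ssh customerXbastion`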
or adding extra stuff that’s a pain to type every time
############
# CustomerX
############

Host dev-control-*.customerX.com
    User ubuntu
    IdentityFile ~/projects/customerX/creds/dev_control.pem

Host dev-es-*.customerX.com
    User ubuntu
    IdentityFile ~/projects/customerX/creds/dev_es.pem

Host dev-hdp-*.customerX.com
    User ubuntu
    IdentityFile ~/projects/customerX/creds/dev_hdp.pem
(etc)
Notice the wildcard pattern entries?
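Any connection matching one of the patterns picks up its settings automatically. For example (the node name here is hypothetical), this logs in as ubuntu with the dev_control key, no extra flags needed:

`ssh dev-control-01.customerX.com`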
You can include tunnels (discussed below)
Host myserver
    Hostname 10.2.3.4
    LocalForward 7080 localhost:7080
    LocalForward 8080 localhost:8080
or proxies (also discussed below)
#############
# CustomerY
#############

Host customerYbastion
    Hostname ec2-yyy-yyy-yyy-yyy.compute-1.amazonaws.com
    User ubuntu
    ProxyCommand none

Host *.inside.customerY.com
    User ubuntu
    ProxyCommand ssh customerYbastion nc -q0 %h %p
Once you add multiple cluster configs and different customer environments, these SSH config files can get quite complex. Here are a few ways I’ve seen people manage that:
- just manage one big `~/.ssh/config` file by hand and use `Host` names and comments to keep track of everything
- explicitly specify config files at the command line a la `ssh -F ~/.ssh/customerX-config <server>`… maybe even use a shell alias to shorten this if you do it a lot
- [what I currently do] scripts to glue multiple config snippets from `~/.ssh/config.d/customerX.conf` into a single big read-only `~/.ssh/config` (a minimal sketch follows this list). It’d be nice to eventually change the ssh client to optionally read from these kinds of `~/.ssh/config.d/` and `~/.ssh/authorized_keys.d/` snippet directories
- customer-specific containers… I actually work a lot from inside of containers on an EC2 instance. I usually have them just bind-mount the underlying host’s home directory, but you could easily keep them isolated with separate config and spin them up only when you need an overlay specific to a customer. This also works even with GUI apps on a laptop btw, but that’s a longer story :)
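For the glue-script approach, something minimal does the trick. Here’s a sketch, assuming the ~/.ssh/config.d/*.conf layout from above:

#!/bin/sh
# rebuild a read-only ~/.ssh/config from per-customer snippets
set -e
out="$HOME/.ssh/config"
if [ -f "$out" ]; then
    chmod u+w "$out"    # the generated config is kept read-only between rebuilds
fi
cat "$HOME"/.ssh/config.d/*.conf > "$out"
chmod 400 "$out"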
It’s also pretty common for folks to write scripts using config management (Juju, knife, or Cloudera Manager-like APIs) to generate ssh config snippets from a running infrastructure. This can be quite useful, but it’s still a static picture of a cluster that changes. Depending on the lifetime or stability of the cluster, you’re often better off using a more dynamic approach like `knife ssh`. It’s a no-win tradeoff between sharing static SSH config snippets and configuring Chef environments for everyone who needs to access the cluster.
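For reference, `knife ssh` resolves a live node list from the Chef server with a search query and runs a command across the matches, so there’s nothing static to go stale. The query and command here are just placeholders:

`knife ssh 'role:hadoop-worker' 'df -h /tmp'`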
I’d love to hear other solutions folks have come up with to deal with this. I have no clue what Puppet offers here, and I bet there are great examples of using Ansible’s EC2 inventory plugin as a dead-simple way to interact with a dynamic host inventory. Perhaps that’s where I’ll head next… we’ll see. It totally depends on customer environments.
Proxies
One server, a bastion host, accepts SSH traffic from the outside world. The remaining target hosts in the cluster are configured for internal access only.
Consider the following scenario using a ProxyCommand. Take an externally accessible bastion and an internally accessible target.

Set up your SSH config so you can ssh directly to the bastion host
+--------------------+ +-------------------+
| | | |
| | | |
| | | |
| | | |
| | | |
| laptop | | bastion |
| | ssh | |
| +---------> +
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
+--------------------+ +-------------------+
with a command like
`ssh bastion`
Then you can ssh from there to a target host
+-------------------+ +-------------------+
| | | |
| | | |
| | | |
| | | |
| | | |
| bastion | | target |
| | ssh | |
| +----------> |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
+-------------------+ +-------------------+
`ssh target`
The key bit here is that we can compress this to one step for the user.
+--------------------+ +-------------------+ +-------------------+
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| laptop | | bastion | | target |
| | ssh | | ssh | |
| +---------> +-------> |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
+--------------------+ +-------------------+ +-------------------+
From the laptop’s ~/.ssh/config file:
Host bastion
    Hostname ec2-xxx.xxx.....amazon.com

Host target
    Hostname ip-10-xx-xx-xx.internal....amazon.com
    ProxyCommand ssh bastion nc -q1 %h %p
Then you can just `ssh target` directly from your laptop. It automatically traverses the proxy bastion on your behalf.
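As an aside, that ProxyCommand relies on `nc` being installed on the bastion. OpenSSH 5.4 and newer can forward the connection natively with the `-W` flag, so a variant of the same stanza (assuming a new enough client) would be:

Host target
    Hostname ip-10-xx-xx-xx.internal....amazon.com
    ProxyCommand ssh -W %h:%p bastion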
Note that from an administrative perspective, it’s easy to control user access at the single bastion… if you can’t establish an ssh connection to the bastion, you can’t “jump through it” to internal hosts.
Tunnels
SSH in general is a tunnel
+-----------------------+ +------------------------+
| | | |
| | | |
| | Inet | |
| | | |
| +-----------------------------> |
| + <-- text --> | fred |
| + | |
| laptop +-----------------------------> (ec2 instance) |
| | | |
| | | (any remote server) |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | +------------------------+
+-----------------------+
`ssh fred`
forward tunnels
aka, “port forwarding”
forwarding web traffic
+----------------+ +------------------------+
| | | |
| | | |
| | | |
-- 8888 ->| | | |
| +------------------> |
| + <-- text --> | |
| + <-- web --> | | -----> http://nfl.com/
| laptop +------------------> ec2 instance |
| | | |
| | | (any remote server) |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | +------------------------+
+----------------+
`ssh fred -L8888:www.nfl.com:80`
`open http://localhost:8888/`
forwarding localhost
+------------+ +------------------------+
| | | |
| | | |
-- 50070->| | | |
| | | | http://... <---+
| | | | (50070) |
| +----------------> | |
| + <-----> | | |
| + <-----> | | ----------------+
| laptop +----------------> ec2 instance |
| | | |
| | | (any remote server) |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | +------------------------+
+------------+
`ssh fred -L8888:localhost:80`
or, perhaps more useful…
`ssh fred -L50070:localhost:50070`
`open http://localhost:50070/`
or
`ssh fred -L50070:localhost:50070 -L50030:localhost:50030`
reverse tunnels
+-----------------------+ +------------------------+
| | | |
| | -- 2222 ->| |
| | | |
| | | |
| | | |
| +-----------------------------> |
| + <-----> | |
| + <-----> | |
| laptop +-----------------------------> ec2 instance |
| | | |
| | | (any remote server) |
| | | |
| | 22 <---+ | |
| | | | |
| | | | |
| |--------+ | |
| | | |
| | +------------------------+
+-----------------------+
`ssh fred -R2222:localhost:22`
or maybe something like…
`ssh fred -R8888:localhost:80`
or even `ssh root@fred -R80:localhost:80` (remote ports below 1024 are privileged, hence root)
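With the `-R2222` example above in place, anyone with a shell on fred can ssh back into the laptop through the tunnel (the laptop-side username here is hypothetical):

`ssh -p 2222 laptopuser@localhost`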
add tunnels to your ssh config
Host myhost
    Hostname 10.1.2.3
    LocalForward 7080 localhost:7080
    LocalForward 8080 localhost:8080
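With that in place, a plain `ssh myhost` brings up both forwards automatically every time you connect.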
If you have any questions or feedback, please feel free to share it with me on Twitter: @m_3