Monday, June 30, 2014

Opening and closing ports on iptables

It turns out this is pretty easy.  The insert (-I) and append (-A) commands do most of the work.

iptables works by setting up chains of filters for certain types of requests.  To see all your chains, type:

# This gives you a verbose list of the rules, with numerical displays for port numbers
iptables -L -v -n

You may find that some chains feed into other chains.  Understanding the flow of how iptables handles letting packets through is the first step toward getting your rules in the right place.  In the default filter table there are 3 predefined chains (INPUT, FORWARD, and OUTPUT).  These are the starting points for processing any network traffic handled by iptables.

iptables works by running rules against packets in order until it finds one that matches.  When it finds a rule that matches, it applies the relevant action to the packet, which could be accepting, dropping, rejecting, forwarding, or any of a number of other actions.

The rules are processed in order within their chains, so order matters.  Often a chain will end with a line that looks like this:

REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

This rule rejects all traffic on all ports.  This is a common way to handle whitelisting only approved activities and rejecting everything else.  For your rule to take effect, it has to come before this rule in the chain.
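To make the ordering point concrete, here is a minimal sketch in Python of first-match evaluation.  It is a toy model, not how iptables is implemented, and the rule dictionaries and the first_match name are made up for illustration.

```python
def first_match(chain, packet):
    """Return the target of the first rule whose fields all match the packet."""
    for rule in chain:
        # An empty "match" dict matches every packet, like a catch-all rule.
        if all(packet.get(field) == value for field, value in rule["match"].items()):
            return rule["target"]
    return None  # nothing matched; a real chain would fall back to its policy


SSH = {"match": {"proto": "tcp", "dport": 22}, "target": "ACCEPT"}
WEB = {"match": {"proto": "tcp", "dport": 8080}, "target": "ACCEPT"}
CATCH_ALL = {"match": {}, "target": "REJECT"}

packet = {"proto": "tcp", "dport": 8080}

# Appending after the catch-all: the new rule is never reached.
appended = [SSH, CATCH_ALL, WEB]
# Inserting before the catch-all: the new rule takes effect.
inserted = [SSH, WEB, CATCH_ALL]

print(first_match(appended, packet))
print(first_match(inserted, packet))
```

The same packet gets a different verdict depending only on where the port 8080 rule sits relative to the catch-all.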

Once you understand this, the value of the insert command makes more sense.  You need to get your rule into the appropriate place in the chain.  An example of opening port 8080 is below.  In this example I'm adding the rule to a specific chain (RH-Firewall-1-INPUT) that is handling all packets routed through the default INPUT and FORWARD chains.

# The rule closing all ports was previously in the 16th spot in the chain
# This new rule opens port 8080 by putting a rule right before that "catch all" exclusion rule
iptables -I RH-Firewall-1-INPUT 16 -m state --state NEW -p tcp --dport 8080 -j ACCEPT
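If you don't want to count rules by hand, the position of the catch-all rule can be read from the output of "iptables -L RH-Firewall-1-INPUT -n --line-numbers".  The following is a rough sketch that parses that listing.  The SAMPLE text is illustrative (not output from a real machine), the catch_all_position helper is hypothetical, and the parsing assumes the usual column layout.

```python
# Illustrative listing in the shape printed by:
#   iptables -L RH-Firewall-1-INPUT -n --line-numbers
SAMPLE = """\
Chain RH-Firewall-1-INPUT (2 references)
num  target     prot opt source               destination
1    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
2    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:22
3    REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited
"""


def catch_all_position(listing):
    """Return the rule number of the first catch-all REJECT/DROP rule, or None."""
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with a number; header lines do not.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        num, target, prot, _opt, source, dest = fields[:6]
        extra = fields[6:]
        if extra and extra[0] != "reject-with":
            continue  # the rule has extra match criteria, so it is not a catch-all
        if (target in ("REJECT", "DROP") and prot == "all"
                and source == "0.0.0.0/0" and dest == "0.0.0.0/0"):
            return int(num)
    return None


print(catch_all_position(SAMPLE))
```

The number returned is the one to pass to "iptables -I", since inserting at position N pushes the rule currently at N down by one.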

If you happen to make a mistake, you can easily delete a rule at a specific point in your chain with the following.

# Deletes rule 12 in chain RH-Firewall-1-INPUT
iptables -D RH-Firewall-1-INPUT 12

It's usually good to add a line blocking all unspecified traffic at the end of your config file.

# Reject all traffic not explicitly allowed in previous rules
iptables -A RH-Firewall-1-INPUT -p all -j REJECT

There is a lot more you can do with iptables, but hopefully this was a helpful starting point.

BONUS

If you're working with a VM in VirtualBox, you can edit the port forwarding rules and they will take effect without having to reboot the VM.

DOUBLE BONUS

When trying to make sure a network service is working, here are a few good steps that I found to minimize frustration:
  • Turn off selinux (sudo setenforce 0)
  • Turn off iptables (service iptables stop)
  • Use nmap to scan open ports (nmap -sS -O 127.0.0.1)
  • Use curl to make sure you can access the service locally (applies to HTTP services only)
Once you can get to the service from the inside, gradually start turning services back on until something breaks.  Then fix it.
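For services that don't speak HTTP, a quick TCP connect test does a similar job to curl.  Here is a small sketch using only the Python standard library.  The port_is_open name and the example port are my own choices.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: treat all of these as "not open".
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # 8080 is just an example port; use whatever your service listens on.
    print(port_is_open("127.0.0.1", 8080))
```

Run it on the server first, then from another machine.  If it succeeds locally but fails remotely, the firewall (or port forwarding) is the likely culprit.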



Wednesday, June 25, 2014

Hosting multiple versions of a django site on subdomains with apache/modwsgi

I've run into a couple scenarios recently where customers want to have access to multiple versions of a site at the same time.

Why multiple versions

The first scenario involved an analysis application where a bit of simulation code changed.  In that case we were fairly sure that the customer would want to use the updated model, but we wanted to provide access to both versions so they could do some comparisons.

The second scenario involved an application that pulled data from a remote database, cleaned it up, and provided an interface for browsing the data.  The format of the data in the remote database changed, but the customer wanted to be able to still connect and update from tables containing data in the old format as well as the new format.

For both of these situations, it would have been possible to edit the user interface and the backend code to allow access to both versions of the application at the same time, but this would have made for more confusing interfaces and a much more complex codebase.

Why subdomains

There are 2 main ways to handle serving 2 versions at the same time: 1) using different ports 2) using subdomains.  Each method has its upsides and downsides.

If you serve off multiple ports, you first have to open another port in your firewall.  For many applications (esp. those sitting in a customer testbed) this isn't a big deal.  In my group's situation, we deploy into some pretty tightly monitored environments, and minimizing the number of open ports makes certification and approval a simpler process.  Also, serving off of multiple ports just makes for less pretty urls.  "simv2test.myapplication.com" is cleaner and more self-documenting than "myapplication.com:8080".

If you choose to work with subdomains, you'll need a wildcard DNS record to be able to grab all the traffic to your site.  Some hosts and DNS services provide this automatically, and some make you pay more for that service.  Also, if you're serving over SSL, you'll need a wildcard SSL certificate.  This may also cost a bit more than a normal single domain certificate.

After considering both options, we decided subdomains made more sense in each of the scenarios described above.

How

First consider your data.  In both of the situations described above we were serving up 2 different versions of the code and the data.  We handled this by copying our database into a new database, and pointing the new fork of the application at this new database.

In mysql the command was as simple as this (after creating the "appname_simv2test" database):

sudo mysqldump appname | sudo mysql appname_simv2test

Mongo has a command specifically for this, which can be invoked from the mongo shell.
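On the Django side, the forked copy of the application then needs its settings pointed at the copied database.  A minimal sketch for the mysql case is below.  BASE_DATABASES stands in for whatever your existing settings contain, the user name and password are placeholders, and the databases_for_fork helper is hypothetical.

```python
import copy

# Stand-in for the DATABASES setting of the original deployment (values are placeholders).
BASE_DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "appname",
        "USER": "appname_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "",
    }
}


def databases_for_fork(base, suffix):
    """Return a copy of the settings whose default database is '<name>_<suffix>'."""
    forked = copy.deepcopy(base)  # leave the original settings untouched
    forked["default"]["NAME"] = "%s_%s" % (base["default"]["NAME"], suffix)
    return forked


# In the settings file of the forked deployment:
DATABASES = databases_for_fork(BASE_DATABASES, "simv2test")
```

Keeping the suffix the same for the database, the code directory, and the subdomain makes it harder to cross the wires between versions.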

Next, set up the application code.  The code for one of the projects was being served from "/var/www/deploy/appname/", so I copied the new version of the code to "/var/www/deploy/appname_simv2test".  Make sure to make the necessary permission changes to the files and directories.  I found that writing a fabric task to deploy each version of the application made this much easier.

Finally, set up your apache configuration to serve up each version of the application at the appropriate subdomain.  Something like the following should work ok.



You probably don't want to do this with too many subdomains on one server, because each extra version adds another full copy of the resources running on your machine (with 2 versions, that's 2x the application threads and 2x the database tables).

But for a simple temporary solution, that should do it.

BONUS

Here's a version that deploys over ports instead of subdomains.  One of the ports is served over https (port 80 redirected to 443), and the other is just over http on port 8080.


Monday, June 23, 2014

Getting django LiveServerTestCase working with selenium's remote webdriver on VirtualBox

Our group does our development on linux VMs, usually running on a Windows host.  We want our developers to be able to write selenium system tests to wrap some of our existing functionality before we start diving into some deep refactoring.

Most of the LiveServerTestCase documentation I have seen is for the case of django running locally and talking to selenium directly.  Getting an instance of django running on a VM working with selenium running on the VM host required a few adjustments.

Start with a modern Django

LiveServerTestCase was introduced in Django 1.4.  We were on Django 1.3.  I tried using django-selenium, but had significant problems with its built-in test server implementation not starting, not stopping, or crashing in strange ways.

I ended up upgrading our project from django 1.3 to 1.6.5.  For our large project this just took ~2 hours of fiddling.

Open/forward required ports

It's probably best to just turn off the iptables service when setting things up.  If you have selinux running, set it in permissive mode.

Add a line in your settings.py file to configure the port to use for the testserver used by LiveServerTestCase.

os.environ['DJANGO_LIVE_TEST_SERVER_ADDRESS'] = '0.0.0.0:8008'

It's important that you use '0.0.0.0' and not 'localhost' so that the port forwarding on VirtualBox works.  I'm using 8008 because it is one of the auxiliary http ports recognized by the selinux default configuration.

Then edit the settings of the VM in VirtualBox to forward port 8008 to some unused port on your local machine.  We're forwarding port 80 on the VM to 8888, so I forwarded this test port to 8889.

Serve up static files

We have apache serving static files at /static_assets/.

The test server is a python server, so we had to configure it to find and serve these static files.  In a test-specific settings file, I added:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': 'fact_rdb',
    }
}

if settings.DATABASES['default']['ENGINE'] == 'django.db.backends.sqlite3':
    # Only wire up the static file view when running against the sqlite test database
    import os
    from django.conf.urls import patterns, url
    STATIC_ASSETS_URL = "/static_assets/"
    STATIC_ASSETS_ROOT = os.path.join(settings.PROJECT_PATH, 'static_assets')
    # Let unauthenticated requests through to the static assets
    LOGIN_REQUIRED_URLS_EXCEPTIONS = tuple(list(LOGIN_REQUIRED_URLS_EXCEPTIONS) + [r'^/static_assets/.*$'])
    # Route every /static_assets/ request to the test-only view
    urlpatterns += patterns('',
        url(r'^static_assets/.*$', 'myapp.views.serve_static_assets', name="static_assets_for_testing"),
    )

The last statement (the urlpatterns entry) is the important one.  It routes requests to the view described below.

The line where I am editing "LOGIN_REQUIRED_URLS_EXCEPTIONS" exists because we are using a middleware to handle restricting access to certain urls.  You can see what that middleware looks like here.  Remove that line if you're not using that middleware.

I'm also configuring the tests to use an in-memory sqlite database.  I recommend that you use this if possible.  If you are using custom sql in your code, this may not be possible, but if you're just using the ORM it should work just fine.  Running with an in-memory database (in my experience) seems to take a couple seconds off the execution time of every test in your test suite.

The view that ties into the url in the code above and serves up the static assets is as follows:

import os
from mimetypes import guess_type
from django.http import HttpResponse
from django.core.servers.basehttp import FileWrapper

def serve_static_assets(request):
    # Take a url like /static_assets/path/to/file.js and create a path
    filename = "/path/to/static/dir" + request.path_info
    static_file = FileWrapper(open(filename, 'rb'))
    # Guess the content type from the file extension, with a generic fallback
    content_type = guess_type(request.path_info, False)[0] or 'application/octet-stream'
    # Use content_type rather than the deprecated mimetype argument
    response = HttpResponse(static_file, content_type=content_type)
    response['Content-Length'] = os.path.getsize(filename)
    return response

WARNING: This is not a secure way to serve static files.  Please do not use this for anything but testing.
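One reason the view above is unsafe is that it joins the request path straight onto a directory, so a path containing ".." can reach files outside the static directory.  If you want the test view to be a little more defensive, a guard along the following lines could be used to build the filename.  This is a sketch under my own assumptions: the resolve_static_path name and its arguments are hypothetical, and it still isn't a substitute for a real static file server.

```python
import os


def resolve_static_path(static_root, url_path, url_prefix="/static_assets/"):
    """Map a URL path to a file path under static_root, or return None if it escapes."""
    if not url_path.startswith(url_prefix):
        return None
    relative = url_path[len(url_prefix):]
    root = os.path.realpath(static_root)
    # realpath collapses ".." segments and symlinks before the containment check.
    candidate = os.path.realpath(os.path.join(root, relative))
    # os.path.join(root, "") yields root with exactly one trailing separator.
    if not candidate.startswith(os.path.join(root, "")):
        return None
    return candidate


if __name__ == "__main__":
    print(resolve_static_path("/path/to/static/dir", "/static_assets/js/app.js"))
    print(resolve_static_path("/path/to/static/dir", "/static_assets/../settings.py"))
```

In the view, a None result would translate into a 404 instead of opening the file.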


Add setUpClass and tearDownClass methods to your test classes

The following sets up a web driver (available at "self.driver" in your test functions) connected to the driver on the host machine.

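from django.test import LiveServerTestCase
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

SELENIUM_HOST = '10.0.2.2'
SELENIUM_PORT = 4444

class LoginTest(LiveServerTestCase):

    @classmethod
    def setUpClass(cls):
        # Connect to the selenium server running on the host machine
        cls.driver = webdriver.Remote(
            command_executor='http://%s:%s/wd/hub' %(SELENIUM_HOST, SELENIUM_PORT),
            desired_capabilities=DesiredCapabilities.CHROME)
        super(LoginTest, cls).setUpClass()

    @classmethod
    def tearDownClass(cls):
        # Close the browser before the live test server shuts down
        cls.driver.quit()
        super(LoginTest, cls).tearDownClass()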

SELENIUM_HOST and SELENIUM_PORT point at the selenium server running on the host machine.  On VirtualBox (with the default NAT networking), the IP of the host machine as seen from the VM is usually 10.0.2.2.
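One thing to watch for inside the test functions: the browser runs on the host, so it has to reach the test server through the forwarded port (8889 in my setup), not through the address the server is bound to on the VM.  Here is a small sketch of a helper that rewrites the live server URL accordingly.  The host_visible_url name is hypothetical, and "localhost:8889" assumes the port forwarding described earlier.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2


def host_visible_url(live_server_url, host="localhost", port=8889):
    """Swap the host and port of a URL for the ones the host's browser should use."""
    parts = urlsplit(live_server_url)
    netloc = "%s:%d" % (host, port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # With DJANGO_LIVE_TEST_SERVER_ADDRESS set to 0.0.0.0:8008 on the VM:
    print(host_visible_url("http://0.0.0.0:8008/login/"))
```

In a test you would then call something like self.driver.get(host_visible_url(self.live_server_url + "/login/")), where "/login/" is just an example path.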

Running your tests

You'll need to first start a selenium server running on the host machine.  If the selenium driver server isn't running there will be nothing for your selenium test runner to talk to.

To do this you'll need java and selenium installed.  Also, add the executables for any desired driver plugins (e.g. the chrome driver plugin) on your path.  Run the server with something like:

"C:\Program Files (x86)\Java\jre7\bin\java.exe" -jar selenium-server-standalone-2.33.0.jar

On the VM, run the command to test your application.  Something like:

python manage.py test --settings=custom_settings_file module.class.test_function

I hope that helped!


Wednesday, June 18, 2014

Decompiling python bytecode with pycdc

We somehow lost the correct working version of a python file for a project, but one of our servers still had the pyc file (which was working fine in production).  To fix this, I went hunting for a good solution to get back our source code.

From what I found, it seems the pycdc library is the best option currently, though there are alternatives.  When I tried unpyc it threw an error for me, and uncompyle2 only works with Python 2.7.
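Since tool support depends on which Python version produced the pyc file, it can help to check that first.  A pyc file starts with a 4-byte magic value: a 2-byte little-endian number followed by "\r\n".  The sketch below reads it.  The function names are my own, and the lookup table only lists the value I'm sure of (62211 for Python 2.7).

```python
import struct

# Magic number -> Python version.  Deliberately incomplete; extend as needed.
KNOWN_MAGIC = {62211: "2.7"}


def magic_from_header(header):
    """Return the magic number encoded in the first 4 bytes of a .pyc file."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        raise ValueError("does not look like a .pyc header")
    return struct.unpack("<H", header[:2])[0]


def python_version_for(header):
    """Return the Python version for a .pyc header, or None if it is not in the table."""
    return KNOWN_MAGIC.get(magic_from_header(header))


def pyc_version(path):
    """Read the header of the .pyc file at path and look up its Python version."""
    with open(path, "rb") as handle:
        return python_version_for(handle.read(4))
```

Calling pyc_version("filename.pyc") returns "2.7" for a file compiled by Python 2.7, and None for anything the table doesn't know about.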

Here are the steps to set up pycdc.  These instructions are for CentOS 5.3, so they may need to be tweaked for your system.

Install CMake


wget http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz
tar xzvf cmake-2.8.12.2.tar.gz
cd cmake-2.8.12.2
./bootstrap
make
make install

Download and compile pycdc

git clone git@github.com:zrax/pycdc.git
cd pycdc
/usr/local/bin/cmake ../pycdc/
make

Using pycdc to decompile

The program outputs to stdout, so redirect to a file.

./pycdc/pycdc filename.pyc > filename.py

That's it.