Configuring Betamax
By now you’ve seen examples where we pass a great many keyword arguments to use_cassette(). You have also seen that we used betamax.Betamax.configure(). In this section, we’ll describe the different approaches in depth and explain why you might pick one over the other.
Global Configuration
Admittedly, I am not too proud of my decision to borrow this design from VCR, but I did and I use it and it isn’t entirely terrible. (Note: I do hope to come up with an elegant way to redesign it for v1.0.0 but that’s a long way off.)
The best way to configure Betamax globally is by using betamax.Betamax.configure(). This returns a betamax.configure.Configuration instance. This instance can be used as a context manager in order to make the usage look more like VCR’s way of configuring the library. For example, in VCR, you might do
VCR.configure do |config|
  config.cassette_library_dir = 'examples/cassettes'
  config.default_cassette_options[:record] = :none
  # ...
end
Whereas with Betamax you might do
from betamax import Betamax
with Betamax.configure() as config:
    config.cassette_library_dir = 'examples/cassettes'
    config.default_cassette_options['record_mode'] = 'none'
Alternatively, since the object returned is really just an object and does not do anything special as a context manager, you could just as easily do
from betamax import Betamax
config = Betamax.configure()
config.cassette_library_dir = 'examples/cassettes'
config.default_cassette_options['record_mode'] = 'none'
We’ll now move on to specific use-cases when configuring Betamax. We’ll exclude the portion of each example where we create a Configuration instance.
Setting the Directory in which Betamax Should Store Cassette Files
Each and every time we use Betamax we need to tell it where to store (and discover) cassette files. By default we do this by setting the cassette_library_dir attribute on our config object, e.g.,
config.cassette_library_dir = 'tests/integration/cassettes'
Note that these paths are relative to what Python thinks is the current working directory. Wherever you run your tests from, write the path to be relative to that directory.
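One way to avoid depending on the current working directory is to build the path from the location of the test file itself. The following is a minimal sketch, not something Betamax requires; the helper name cassette_dir_for and the cassettes subdirectory are assumptions made for illustration.

```python
import os


def cassette_dir_for(test_file, subdir='cassettes'):
    """Return an absolute cassette directory located next to ``test_file``.

    ``cassette_dir_for`` and the ``subdir`` default are hypothetical names
    used only for this sketch.
    """
    here = os.path.dirname(os.path.abspath(test_file))
    return os.path.join(here, subdir)


# In a test module you would pass __file__, for example:
#     config.cassette_library_dir = cassette_dir_for(__file__)
```

Because the result is absolute, it should not matter which directory the tests are started from.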
Setting Default Cassette Options
Cassettes have default options used by Betamax if none are set. For example,
- The default record mode is once.
- The default matchers used are method and uri.
- Cassettes do not preserve the exact body bytes by default.
These can all be configured as you please. For example, if you want to change the default matchers and preserve exact body bytes, you would do
config.default_cassette_options['match_requests_on'] = [
    'method',
    'uri',
    'headers',
]
config.preserve_exact_body_bytes = True
Filtering Sensitive Data
Most interactions you record are likely to require authentication, and you won’t want the credentials to end up in your cassettes. For this we can define placeholders in our cassettes. Let’s use a very real example.
Let’s say that you want to get your user data from GitHub using Requests. You might have code that looks like this:
def me(username, password, session):
    r = session.get('https://api.github.com/user', auth=(username, password))
    r.raise_for_status()
    return r.json()
You would test this with something like:
import os
import betamax
import requests
from my_module import me
session = requests.Session()
recorder = betamax.Betamax(session)
username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')
with recorder.use_cassette('test-me'):
    json = me(username, password, session)
    # assertions about the JSON returned
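The problem is that your username and password will now be recorded in the cassette, which you then won’t want to push to version control. How can we prevent that from happening? Requests sends these credentials base64-encoded in the Authorization header, so that encoded value is what we replace with a placeholder: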
import base64
import os
username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')
config.define_cassette_placeholder(
    '<GITHUB-AUTH>',
    # b64encode returns bytes on Python 3, so decode it back to text
    base64.b64encode(
        '{0}:{1}'.format(username, password).encode('utf-8')
    ).decode('utf-8')
)
Note
Obviously you can refactor this a bit so you can pull those environment variables out in only one place, but I’d rather be clear than not here.
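To see why the placeholder value is base64-encoded, it helps to look at what HTTP Basic authentication actually puts in the request. The sketch below uses only the standard library; the helper name basic_auth_token is hypothetical, and the string replacement at the end only illustrates the idea rather than reproducing Betamax’s own implementation.

```python
import base64


def basic_auth_token(username, password):
    """Return the base64 text that HTTP Basic auth sends for these credentials."""
    raw = '{0}:{1}'.format(username, password).encode('utf-8')
    return base64.b64encode(raw).decode('utf-8')


token = basic_auth_token('testuser', 'testpassword')
# This is the Authorization header value that would otherwise be recorded.
recorded_header = 'Basic ' + token
# Illustration only: swap the sensitive text for the placeholder.
sanitized_header = recorded_header.replace(token, '<GITHUB-AUTH>')
print(sanitized_header)
```

The plain username and password never appear in the header, which is why a placeholder for either of them alone would not match anything in this request.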
The first time you run the test script you would invoke your tests like so:
$ USERNAME='my-real-username' PASSWORD='supersecretep@55w0rd' \
    python test_script.py
Future runs of the script could simply be run without those environment variables, e.g.,
$ python test_script.py
This means that you can run these tests on a service like Travis-CI without providing credentials.
If you cannot anticipate what you will need to filter out, you can use the before_record and before_playback hooks, which were added in version 0.7.0 of Betamax. Both hooks pass the Interaction and the Cassette to the function provided. An example callback would look like:
def hook(interaction, cassette):
    pass
You would then register this callback:
# Either
config.before_record(callback=hook)
# Or
config.before_playback(callback=hook)
You can register callables for both hooks. If you wish to ignore an interaction and prevent it from being recorded or replayed, you can call the interaction’s ignore() method. You also have full access to all of the methods and attributes on an instance of an Interaction. This will allow you to inspect the response produced by the interaction and then modify it.
Let’s say, for example, that you are talking to an API that grants authorization tokens on a specific request. In this example, you might authenticate initially using a username and password and then use a token after authenticating. You want, however, for the token to be kept secret. In that case you might configure Betamax to replace the username and password, e.g.,
config.define_cassette_placeholder('<USERNAME>', username)
config.define_cassette_placeholder('<PASSWORD>', password)
And you would also write a function that, prior to recording, finds the token, saves it, and obscures it from the recorded version of the cassette:
from betamax.cassette import cassette
def sanitize_token(interaction, current_cassette):
    # Exit early if the request did not return 200 OK because that's the
    # only time we want to look for Authorization-Token headers
    if interaction.data['response']['status']['code'] != 200:
        return
    headers = interaction.data['response']['headers']
    token = headers.get('Authorization-Token')
    # If there was no token header in the response, exit
    if token is None:
        return
    # Otherwise, create a new placeholder so that when cassette is saved,
    # Betamax will replace the token with our placeholder.
    current_cassette.placeholders.append(
        cassette.Placeholder(placeholder='<AUTH_TOKEN>', replace=token)
    )
This will dynamically create a placeholder for that cassette only. Once we have our hook, we need merely register it like so:
config.before_record(callback=sanitize_token)
And we no longer need to worry about leaking sensitive data.
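The ignore() method mentioned earlier can be called from a hook in the same way. As a minimal sketch, the following callback skips any interaction whose response is a server error. The name skip_server_errors and the 500 threshold are assumptions for illustration, and the status lookup reuses the interaction.data layout shown in sanitize_token.

```python
def skip_server_errors(interaction, current_cassette):
    """Hypothetical hook: leave 5xx responses out of the cassette."""
    status_code = interaction.data['response']['status']['code']
    if status_code >= 500:
        # Ask Betamax to ignore this interaction entirely.
        interaction.ignore()


# Registered like any other hook, for example:
#     config.before_record(callback=skip_server_errors)
```

The second argument is unused here, but the callback still has to accept it because both hooks pass the cassette along.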
Setting default serializer
If you want to use a specific serializer for every cassette, you can set serialize_with as a default cassette option. For example, if you wanted to use the prettyjson serializer for every cassette you would do:
config.default_cassette_options['serialize_with'] = 'prettyjson'
Per-Use Configuration
Each time you create a Betamax instance or call use_cassette(), you can pass some of the options from above.
Setting the Directory in which Betamax Should Store Cassette Files
When using per-use configuration of Betamax, you can specify the cassette directory when you instantiate a Betamax object:
session = requests.Session()
recorder = betamax.Betamax(session,
                           cassette_library_dir='tests/cassettes/')
Setting Default Cassette Options
You can also set default cassette options when instantiating a Betamax object:
session = requests.Session()
recorder = betamax.Betamax(session, default_cassette_options={
    'record_mode': 'once',
    'match_requests_on': ['method', 'uri', 'headers'],
    'preserve_exact_body_bytes': True
})
You can also set the above when calling use_cassette():
session = requests.Session()
recorder = betamax.Betamax(session)
with recorder.use_cassette('cassette-name',
                           preserve_exact_body_bytes=True,
                           match_requests_on=['method', 'uri', 'headers'],
                           record='once'):
    session.get('https://httpbin.org/get')
Filtering Sensitive Data
Filtering sensitive data on a per-usage basis is the only difficult (or perhaps, less convenient) case. Cassette placeholders are part of the default cassette options, so we’ll set this value similarly to how we set the other default cassette options. The catch is that placeholders have a specific structure: they are stored as a list of dictionaries. Let’s use our example above and convert it.
import base64
import os
username = os.environ.get('USERNAME', 'testuser')
password = os.environ.get('PASSWORD', 'testpassword')
session = requests.Session()
recorder = betamax.Betamax(session, default_cassette_options={
    'placeholders': [{
        'placeholder': '<GITHUB-AUTH>',
        # b64encode returns bytes on Python 3, so decode it back to text
        'replace': base64.b64encode(
            '{0}:{1}'.format(username, password).encode('utf-8')
        ).decode('utf-8'),
    }]
})
Note that what we passed as our first argument is assigned to the 'placeholder' key while the value we’re replacing is assigned to the 'replace' key.
This isn’t the typical way that people filter sensitive data because they tend to want to do it globally.
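If you do filter on a per-use basis and have several values to hide, writing the list of dictionaries by hand gets repetitive. A small helper can build it for you. This is only a sketch, and build_placeholders is a hypothetical name rather than part of Betamax.

```python
def build_placeholders(replacements):
    """Turn ``{placeholder: sensitive_value}`` into the list-of-dicts form."""
    return [
        {'placeholder': placeholder, 'replace': value}
        for placeholder, value in sorted(replacements.items())
    ]


placeholders = build_placeholders({
    '<USERNAME>': 'testuser',
    '<PASSWORD>': 'testpassword',
})
# The result could then be passed along as
#     default_cassette_options={'placeholders': placeholders}
```

Sorting the items keeps the order of the list stable between runs, which makes the result easier to compare in tests.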
Mixing and Matching
It’s not uncommon to mix and match configuration methodologies. I do this in github3.py. I use global configuration to filter sensitive data and set defaults based on the environment the tests are running in. On Travis-CI, the record mode is set to 'none'. I also set how we match requests and when we preserve exact body bytes on a per-use basis.
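As a rough sketch of the environment-based default described above, the record mode could be chosen from an environment variable. The helper name default_record_mode and the use of the TRAVIS variable (which Travis CI sets to true) are assumptions for illustration; the actual test configuration in github3.py may differ.

```python
import os


def default_record_mode(environ=None):
    """Return 'none' on Travis CI and 'once' everywhere else."""
    environ = os.environ if environ is None else environ
    return 'none' if environ.get('TRAVIS') == 'true' else 'once'


# Applied globally, for example:
#     config.default_cassette_options['record_mode'] = default_record_mode()
```

With 'none' on CI, a test that tries to make a request missing from its cassette fails instead of quietly reaching the network.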