A taste of data science with Auto Trader

We were delighted to welcome Auto Trader to the Alan Turing Building last Wednesday for an industry problem-solving event focused on data analysis.

If you’ve ever looked at buying a car online (or, like me, enjoy a bit of fantasy car shopping), you’ll doubtless have come across Auto Trader’s website; it’s the UK’s largest digital automotive marketplace, with around 48 million visits a month. Combine this site traffic with a plethora of search filters and you get a huge amount of data to deal with. What can maths tell us about users’ search habits and the cars they’re buying?

Cue Auto Trader data scientist Dr Peter Appleby, who presented us with two problems that reflect the mathematical challenges the company faces, each focusing on a different technique for data analysis. The first used regression on historic car prices and mileage to create a valuation model. The second applied clustering to try to understand the search space generated by users combining the different filters: namely, which combinations of filters (make, model, fuel etc.) are more common than others?
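For the curious, here is a minimal sketch of the kind of straight-line regression the first problem calls for. The mileage and price figures are made up for illustration (they are not Auto Trader data), the function names are my own, and a real valuation model would almost certainly use more features than mileage alone.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: mileage in miles, asking price in pounds.
mileage = [10_000, 30_000, 50_000, 70_000, 90_000]
price = [15_200, 13_100, 11_400, 9_300, 7_500]

intercept, slope = fit_line(mileage, price)


def predict_price(miles):
    """Estimate a price from mileage using the fitted line."""
    return intercept + slope * miles


print(f"Each extra mile changes the estimate by about {slope:.3f} pounds")
print(f"Estimated price at 60,000 miles: {predict_price(60_000):.0f} pounds")
```

The fitted slope is negative, as you would hope: more miles on the clock, lower valuation. Plotting the residuals would be a sensible next step to see whether a straight line is really good enough.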

After a quick crash course (pun intended) on regression and clustering, we headed over to the computer cluster to wrangle with some real-life datasets. Of course, there’s only so much you can do in an hour, but there was plenty of discussion and coding going on as people tried to extract useful insights from the data. The clustering problem proved most popular: the conclusion from the session seemed to be that users who search for Audis also tend to look for BMWs and other pricier makes. Not exactly the most groundbreaking conclusion, you might say, but one can imagine extending this method to a complex multidimensional search space, where much more interesting patterns might start to emerge.

An example of k-means clustering for a dataset using two nodes. Source: Wikimedia Commons
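If you would like to try k-means yourself, the sketch below implements the standard two-step loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The data is invented: I have assumed each search can be summarised by two numbers, a typical price in thousands of pounds and a typical car age in years, which is a big simplification of the real filter data we worked with.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm for k-means on a list of equal-length tuples."""
    # Deterministic start: use the first k points as the initial centroids.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster's centroid alone
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical searches: (typical price in thousands of pounds, typical age in years).
searches = [(5, 9), (30, 2), (6, 8), (32, 3), (7, 10), (28, 1)]

centroids, clusters = kmeans(searches, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

With real data you would want to rescale the features so that one does not dominate the distance, and to encode categorical filters such as make or fuel type as numbers before clustering. The result can also depend on the starting centroids, so it is worth trying several.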

In all, it was a fun day for both those familiar with these techniques and those who were trying out data analysis for the first time. As one undergraduate attendee told me over coffee at the close of the event, it didn’t matter that he wasn’t familiar with the techniques beforehand: “it’s fun just to come along and have a go, mess around with different models; you’re learning something new.” For the more experienced, it was “motivating to get to work with a real dataset, rather than making up your own.”

For my part, I wouldn’t necessarily have realised the extent to which maths is used behind the scenes at somewhere like Auto Trader – it’s really interesting to see how maths and data-driven methods are becoming increasingly important tools for companies in the digital age. I would like to offer my personal thanks to Dr Appleby and Auto Trader for their support for this event, and I look forward to organising the next industry problem-solving event!

math.git version control seminar

Many thanks to Weijian Zhang for his excellent introduction to version control with Git. The room was almost packed out (whether this was entirely owing to interest in version control, or the pile of chocolate biscuits on offer, I couldn’t possibly say), but there was a great informal, interactive atmosphere in the seminar. Most of the room (myself included) were trying out Git for the first time on their laptops, with Weijian and more experienced users in the room fielding questions.

Created by Linux godfather Linus Torvalds (apparently over the course of a weekend in response to the loss of access to BitKeeper), Git is now one of the most widely used systems for software version management, as well as being used for countless personal coding projects. Definitely a useful tool for the modern mathematician to have, if not the most glamorous. Speaking to attendees afterwards, however, I found lots of people enthusiastic to go away and start introducing Git to their workflow, so we might even run a follow-up seminar! Who knew there was such enthusiasm for version control in Alan Turing?

In the meantime, if you’re looking to learn more about Git, head to GitHub, which has full documentation and example projects to help familiarise you with the tool. Check out Bitbucket too as an alternative place to set up a remote repository for your projects. The University of Manchester also offers training on Git to staff and PhD students as part of its Research IT support programme; details here (University of Manchester login required).

We’ll be announcing the next math.seminar soon, on creating vector graphics with Inkscape, so watch this space!

New(ish) math.seminar series!

Following on from a successful workshop on Julia (see Chapter Secretary Matthew Gwynne’s blog post for SIAM UKIE), we’re organising a new series of seminars to promote the kinds of computer skills which are useful for any mathematician, but don’t necessarily involve actual maths!

Have you been meaning to set up a personal website to promote your research? Looking for something better than Microsoft Paint to make your diagrams? Need to brush up your LaTeX? Then the “maths dot” series is for you!

Each seminar is aimed at beginners – no previous experience necessary! – and is accompanied by tea, coffee and biscuits. It’s best if you can bring your laptop with you, but feel free to come along and watch if not.

We’ve already had our first successful seminar, math.html, on the basics of HTML and setting up a personal webpage, given by our very own Webmaster Jonathan Deakin (blog post to follow!).

Upcoming math.seminars:

 

math.git
4 – 5 pm, Thursday 16th February 2017
Frank Adams 1, Alan Turing Building
Weijian Zhang

If you’ve ever found yourself using filenames like “project_final_draft_3_March12_DEFINITELYFINAL_v2”, then you need Git! Learn to keep track of your changes to your thesis, paper, source code etc. by using Git for version control.

math.inkscape
Date & time TBC
Georgia Lynott

Want to fill your papers and presentations with beautiful, non-fuzzy graphics? Inkscape is a free vector graphics program with lots of excellent features which will allow you to make high-quality diagrams and graphics.

Have you got an idea for another math.seminar? If you’re interested in sharing your computing expertise, please do contact the committee!