
A crowdsourcing solution to subway tracking

by Benjamin Kabak

A few weeks ago, Alex Bell, the brother of a long-time classmate and baseball teammate of mine, e-mailed me about a project he’s working on. A lifelong New Yorker, Bell wanted to use something prevalent — smartphones — to track subway trains. Instead of a complicated and expensive technology solution to real-time train location systems, Bell believes he can, in his words, “use the sensing capabilities of the smartphone to map the current location of all the NYC subways in real time.”

Enter the Subway Arrival mobile phone application. Using passive data and Base Station ID numbers, the phones of those who opt into Bell’s project supply background data on train location. Bell considered who knows where a train is at any given moment and determined that the short list includes the people on the train, the conductor, the MTA control room and, most importantly, those who have just gotten off the train.

“In a perfect world,” Bell said to me, “all I would have had to do was write a simple application which let users broadcast that they had left such and such train heading downtown. But of course no busy New Yorkers would have engaged in that, which is called participatory sensing. Instead I had to devise a method of using opportunistic sensing — sensing that doesn’t involve the user.”

To accomplish this, Bell focused on the Base Station ID number, which identifies the cell tower a phone is currently exchanging data with. The application, he says, looks for periods of “no reception underground coupled with a large distance change.”

“By using some signal processing algorithms the false positives are weeded out and the results are sent, anonymously, to a server,” he explained. “I have optimized the application to have almost negligible effect on battery life. And by keeping the data anonymous, the security concerns are removed.”
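To make the heuristic concrete, here is a minimal sketch of how “no reception coupled with a large distance change” might be detected from a log of tower sightings. It is not Bell’s code: the `Sample` record, the `detect_rides` function and the two thresholds are all hypothetical, and the sketch assumes each Base Station ID has already been resolved to a tower latitude and longitude.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional, Tuple


@dataclass
class Sample:
    """One periodic reading from the phone (hypothetical record layout)."""
    t: float                               # seconds since some reference time
    tower: Optional[Tuple[float, float]]   # (lat, lon) of serving tower, None if no reception


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def detect_rides(samples: List[Sample],
                 min_gap_s: float = 120.0,     # assumed threshold, not from the article
                 min_dist_km: float = 1.0      # assumed threshold, not from the article
                 ) -> List[Tuple[Sample, Sample]]:
    """Return (last reading before the gap, first reading after it) for each
    reception gap that is both long enough and far enough to look like a ride."""
    rides = []
    last_seen: Optional[Sample] = None
    in_gap = False
    for s in samples:
        if s.tower is None:
            # Reception lost; it only counts as a gap if we had a fix before it.
            in_gap = last_seen is not None
            continue
        if in_gap and last_seen is not None:
            gap_s = s.t - last_seen.t
            dist_km = haversine_km(last_seen.tower, s.tower)
            if gap_s >= min_gap_s and dist_km >= min_dist_km:
                rides.append((last_seen, s))
        in_gap = False
        last_seen = s
    return rides
```

A brief dropout in a basement, or a long gap with no movement, fails one of the two thresholds and is ignored. That is only the crudest form of the false-positive filtering Bell describes.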

This is crowdsourcing at its finest, and it just might work. To allow for differences in, say, walking speed or mobile signal pick-up, the application relies on a hidden Markov model to adjust the time variables. Of course, though, the app still relies on a critical mass of people to make it work. “As I have constructed it, when more and more users download the free application their phones will start reporting the location of the trains. The annotated Google Map, on both the phone and the website, will show icons which indicate which train and traveling in which direction,” Bell notes. “I may configure the server to also approximate arrival times; however, I think it may be more intuitive to see the icon approaching your station. The biggest IF I see is that if not enough users download the app then the data will be spotty and incomplete. Then users will delete the app before it had reached the critical mass of people to make it truly useful.”
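For readers unfamiliar with hidden Markov models, the sketch below shows the general idea on a toy problem. It is an illustration only, not a description of Bell’s model, whose details the article does not give. The assumptions here are mine: the hidden state is the station a train has actually reached, each observation is the station a rider’s phone reported (occasionally wrong), a train either stays put or advances one station per time step, and all probabilities are invented. The standard Viterbi algorithm then recovers the most likely sequence of true positions.

```python
from math import inf, log
from typing import List


def _log(p: float) -> float:
    """log(p), with log(0) treated as negative infinity."""
    return log(p) if p > 0 else -inf


def build_transitions(n: int, p_stay: float = 0.5) -> List[List[float]]:
    """Assumed motion model: stay or advance one station; the terminal absorbs."""
    rows = [[0.0] * n for _ in range(n)]
    for s in range(n):
        if s == n - 1:
            rows[s][s] = 1.0
        else:
            rows[s][s] = p_stay
            rows[s][s + 1] = 1.0 - p_stay
    return rows


def build_emissions(n: int, p_correct: float = 0.7) -> List[List[float]]:
    """Assumed report model: usually the right station, otherwise a wrong one,
    with nearby stations more likely than distant ones."""
    rows = []
    for s in range(n):
        weights = [0.0 if o == s else 0.5 ** abs(s - o) for o in range(n)]
        total = sum(weights)
        rows.append([p_correct if o == s else (1.0 - p_correct) * w / total
                     for o, w in enumerate(weights)])
    return rows


def viterbi(obs: List[int], log_start: List[float],
            log_trans: List[List[float]], log_emit: List[List[float]]) -> List[int]:
    """Most likely hidden state sequence for the observations (log-space Viterbi)."""
    n = len(log_start)
    best = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, best, pointers = best, [], []
        for s in range(n):
            p = max(range(n), key=lambda q: prev[q] + log_trans[q][s])
            best.append(prev[p] + log_trans[p][s] + log_emit[s][o])
            pointers.append(p)
        back.append(pointers)
    state = max(range(n), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def decode_reports(reports: List[int], n_stations: int) -> List[int]:
    """Smooth a sequence of noisy station reports into a plausible train path."""
    log_start = [_log(1.0 / n_stations)] * n_stations
    log_trans = [[_log(p) for p in row] for row in build_transitions(n_stations)]
    log_emit = [[_log(p) for p in row] for row in build_emissions(n_stations)]
    return viterbi(reports, log_start, log_trans, log_emit)
```

Calling `decode_reports([0, 0, 1, 0, 2, 2, 3], 5)` treats the stray report of station 0 in the fourth slot as noise, since under these assumptions a train cannot move backward, and returns `[0, 0, 1, 1, 2, 2, 3]`.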

Recently, Bell has updated the map to include confidence intervals based upon the amount of data he has collected, and he is working on building an API so other developers can take advantage of the platform. Currently, Bell is maintaining a Twitter feed and is part of the NYC BigApps competition. It’s certainly a solid approach to a problem that has plagued the city’s subway system for years.

Donald February 21, 2011 - 2:52 pm

This idea will never be implemented because it makes too much sense and is too simple. Not only will it never be implemented, I would not be surprised if the legal dept. sends him a cease and desist letter.

Benjamin Kabak February 21, 2011 - 2:55 pm

Why exactly would the legal department send him a letter when the MTA has been encouraging app development for over a year now? I think you’ve missed the extensive posting done on that. This doesn’t need MTA blessing to work. It’s an open-source solution that requires no commitment from them. The “too much sense/too simple” charge just isn’t applicable to the context of this app.

Donald February 21, 2011 - 3:14 pm

I remember reading a while ago about an app developer who made maps getting sued for royalties by the MTA. If things have changed, then I was not aware (I don’t have any type of smartphone).

Matt January 10, 2012 - 10:53 am

I remember reading that the MTA purposefully opened up all of their train status information via RSS and RESTful APIs, as well as high res maps to the public.

Then I remember reading a subway advert that said “Our apps are whiz kid certified. Instead of developing transit apps, we gave our info to the people who do it best. Search the web for ‘NY transit apps’ to see what we mean.” (http://www.observer.com/2010/m.....didnt-make)

But then again, the legal dept might sue themselves.

Joseph February 21, 2011 - 2:57 pm

I am having trouble finding this in the app store. When I search SubwayArrival nothing shows up

Benjamin Kabak February 21, 2011 - 3:09 pm

Try this link. That’s what Alex says is the direct link to the app store’s page.

Kevin February 21, 2011 - 4:38 pm

It would be more useful if it used GPS or WiFi; that way, users of iPod Touches could also be covered by this application. One thing I’m wondering is how it detects which train you got off of, since our stations generally have more than one train running through.

Also, many routes run elevated or through open cuts. If the smartphone can pick up a signal here (and it should be able to), then a much more accurate location of the train can be detected. Using this data, the server can predict the times for the underground stations, and then use what the program currently does to verify the prediction.

Anon256 February 21, 2011 - 5:00 pm

There’s a post on the app’s blog explaining to some degree how it works (or will work when there are more users) at multiline stations, using among other things where you got on and what projections from other users’ data imply is arriving. The elevated lines question is less clear; at the moment it doesn’t seem to have data on those at all. Continuous tracking might drain battery too much, but a phone could still report in if it observes you moving fast along a subway line and then stopping. I don’t see any way to distinguish people on trains from people in buses and cars on the street below, though.

al February 21, 2011 - 5:52 pm

There is a depiction of the system.


It seems to need long underground sections (or far more users) for the system’s elevated lines to work. The 7 and N are not visible. The IRT New Lots elevated section, Franklin St Shuttle, and Rockaway Park service (,S) are not visible. Part of the R is also not visible. It’s better than the existing TA setup for most of the system, but requires work.

al February 21, 2011 - 5:53 pm

dang it:

The IRT New Lots elevated section, Franklin St Shuttle, and Rockaway Park service (A,S) are not visible. Part of the R is also not visible. It’s better than the existing TA setup for most of the system, but requires work.

Adirondacker12800 February 21, 2011 - 6:06 pm

GPS doesn’t work underground. Doesn’t work well in Manhattan generally, not enough satellites in view.

al February 21, 2011 - 5:39 pm

How does this system work with subway lines on elevated, bridge, embankment, surface, open cut sections? There are also several underground subway stations that receive moderate quality cell service.

The MTA is starting to install cell service in the stations. This might take a while to reach all underground stations, but may do away with train location confidence issues.

A variation on this is for the TA to equip the stations with cell service and issue every motorman a cell phone with this app or a modified version of it. The servers would pick out the designated phones and their locations. This would require some interaction with the TA. Buses and trains running with constant cell service can be tracked via cellular triangulation or, a bit more crudely, base station location changes.

SubwayArrival February 21, 2011 - 7:42 pm

Hey everyone, great comments and questions.
I will try to answer some of the great questions above. Email me at subwayarrival at gmail dot com if you have any more questions/comments/suggestions/anything.

1) Reception at underground stations isn’t a problem because the algorithm verifies the user has exited the station with a combination of waiting and testing + signal strength. If the MTA installs “cell towers” or femtocells in the stations, it would be excellent, because then we could simply check the base station ID, see that it is the ID for the 59th Street station and report an arrived location. Fewer users would be required, as users would be reporting station arrivals as they progressed through the tunnels.

The algorithm also uses the bridges and open cuts, because at these points a large number of users emerge all together on the train and report the position.

2) GPS reception is always tricky in the city. But with assisted GPS, which uses cell site trilateration, the results are acceptable.

3) As mentioned above, we currently do not report locations for elevated tracks. However, we are working on an addition to the system which would be used for above-ground transit such as elevated tracks and especially buses. Buses would be amazing. In the winter I never take the bus, as it means waiting in the freezing cold. I think that above-ground tracking, as with buses, using the passive mobile crowdsourcing approach could provide real-time tracking for vehicles in cities around the world at minimal cost. I am happy to talk more on the subject.

4) The MTA no longer restricts usage of their trademarks. Instead they simply require developers to contact the MTA and have usage approved. http://mta.info/developers/

Alon Levy February 21, 2011 - 9:21 pm

There are two parts to subway tracking in a system like New York’s: figuring out where the train is, and figuring out which train it is. How do you figure out whether the next train on the express track is a 4 or 5? Do you just keep track of the entire run and then see which branch the train is coming from?

Andrew February 21, 2011 - 10:31 pm

Or a rerouted 2? Or a 6? For that matter, how do you tell express from local? Or northbound from southbound?

It’s an interesting idea,

Also, NYCT already has precise real-time train location data on most of the A Division and on the L train, and (I hope) it’s only a matter of time before it’s released to the public. If I were doing this, I wouldn’t even bother with the parts of the system equipped with ATS.

Anon256 February 21, 2011 - 10:44 pm

You can conclude that it’s a local if you register somebody getting off it at a local stop.

Alon Levy February 22, 2011 - 1:43 am

Reroutes you can deal with by hooking into the route change system. Not a big issue.

The bigger problem is what to do with trains that have the same origin but different destinations – e.g. the various A south ends. If I’m not mistaken the A keeps the same south end back and forth, so it’s still doable by tracking each train through the terminus, but it’s harder with one-way special runs like the 7/ and J/Z.

Alon Levy February 22, 2011 - 4:54 pm

To clarify: the last sentence should read “like the local and express 7.” I tried to put the 7 in angular brackets, but it disappeared due to HTML parsing.

Andrew February 22, 2011 - 10:59 pm

What route change system? I’m referring to on-the-fly reroutes – e.g., there’s a sick passenger at Times Square, so a 2 is sent up the Lex.

The map doesn’t show every single route variation. There are loads of trains that enter or leave service at the closest station to the yard, for instance. There are E’s to 179th, there are A’s that start at 59th, there are 1’s to 137th, there are F’s to Kings Highway, there are N’s and Q’s to 57th (even on weekdays), there are 2’s to New Lots, there are 5’s to Utica.

Alon Levy February 23, 2011 - 8:41 pm

Route change system means knowing in advance that e.g. due to weekend work, the E runs through 63rd.

Route variations in which trains with the same origin have different destinations can be dealt with in one of 2 ways. First, if trains leave the terminal on a schedule, or could report variations to the system, then the system could know that e.g. the train that left South Ferry at 5:01 is actually short-turning at 137th. Second, for some services, the system could track trains back and forth – e.g. if the train that left Flatbush had gotten there from the West Side tracks, then it’s still a 2, or (I think but am not sure) if the train that left 207th had gotten there from Lefferts then it’s going to Lefferts again.

Andrew February 24, 2011 - 8:51 am

You’re putting a lot of faith in the schedules. I doubt there has been a single day since 1904 when everything has operated according to schedule. Service disruptions are a way of life.

Yes, it’s possible to determine in advance that E’s are scheduled to run through 63rd for the weekend, but it’s not possible to determine in advance that a bunch of E’s were sent down 63rd because of a signal failure at 23rd-Ely.

The train scheduled to leave South Ferry at 5:01 (in your example) might short-turn at 137th, but the train actually leaving South Ferry at 5:01 – if there is one! – might be the 4:58 running late. Or it might be the 5:01, but due to an earlier service disruption it’s running with the crew from the 4:34, and that crew is scheduled to sign out at 242nd, so the dispatcher decides to send that train all the way through to avoid having to pay overtime for an extra trip.

2 and 5 trains at Flatbush often swap roles (that’s why the maps are often the wrong ones). Lefferts and Far Rockaway trains often swap at 207th.

Streetsblog New York City » Today’s Headlines February 22, 2011 - 9:04 am

[…] Download a Free App and Help Crowdsource Subway Tracking (2nd Ave Sagas) […]

Ben Samra February 22, 2011 - 2:17 pm

Real time tracking is a complicated thing. This sounds like a long way to go for data that is less than real time. An arrival window of 1-2 minutes is quite a large one given a train’s time in station. It could be a viable alternative if we had cell reception in stations. I’m not sure how the algorithm works; I guess the result is offset, and once the user leaves the station it assumes where the train should be? During peak hours, by the time someone leaves the station another train or two would have arrived.

The MTA is committed to installing countdown clocks, and that information is useful where it counts the most: on the platform or in the station, where you don’t have cell reception to check when a train is coming. For other stations I use schedules on iTrans, which is fairly accurate given the train is operating on or near time, and can pin a train within 1-2 minutes the old-fashioned way. If this crowdsourcing was cross-checked with scheduled arrival times, you could fairly say this train is on time/late and others can expect it down the line (on standalone lines). Any system that is less than real time just isn’t useful. London’s system is ideal, and hopefully the MTA will eventually work toward networking the data.

Andrew February 22, 2011 - 11:04 pm

On the A Division and the L train, the countdown clocks are simply a user interface to the real-time data gathered by the ATS system. ATS knows where every train is, where each train is going, what stops each train is scheduled to make. The countdown clocks just translate that information into arrival times.

As soon as NYCT sees fit to release the real-time ATS train location data to the public, a developer will snatch it up and do exactly what Mr. Somerville did in London. That will cover the entire A Division and the Canarsie line. (Well, there are a few segments on the A Division that aren’t in the ATS system yet – the entire 7 and the 2/5 in the Bronx – but they’ll be added on in the coming years.)

The real challenge is the B Division, which doesn’t have ATS.

And I wouldn’t even attempt to rely on schedules.

Someone February 26, 2013 - 1:28 pm

This is a solution.

Cell phone signals as subway countdown clocks :: Second Ave. Sagas June 6, 2011 - 4:57 pm

[…] with Alex Bell, an engineering student at Columbia and the brother of an old friend of mine, about his transit app. His idea was simple: crowdsource train locations through user-submitted messages. Unfortunately, […]

