Tuesday, July 19, 2011

MIT and JSTOR hacked



Okay, it was wrong to hack into the system, but I have to chuckle a bit. I am neither a faculty member nor a student at a university, and thus I am denied access to JSTOR's vast archive of technical and academic papers. As with other journals and journal repositories, a hefty subscription or per-paper fee is demanded, which is ironic, and perhaps even close to an oxymoron, since JSTOR claims status as "a not-for-profit organization".

"Cambridge man accused of stealing 4 million documents in MIT hack"

July 19th, 2011

Boston Herald

A federal indictment unsealed today charges a Cambridge man with computer intrusion, fraud and data theft in computer hacking incidents that targeted the Massachusetts Institute of Technology and JSTOR, a not-for-profit archive of scientific journals and academic work.

Aaron Swartz, 24, was charged in an indictment with wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. If convicted on these charges, Swartz faces up to 35 years in prison, to be followed by three years of supervised release, restitution, forfeiture and a fine of up to $1 million.


The indictment alleges that between Sept. 24, 2010, and Jan. 6, 2011, Swartz contrived to break into a restricted computer wiring closet in a basement at MIT and to access MIT’s network without authorization from a computer switch within that closet. He is charged with doing this in order to download a major portion of JSTOR’s archive of digitized academic journal articles onto his computers and hard drives. JSTOR is a not-for-profit organization that has invested heavily in providing an online system for archiving, accessing, and searching digitized copies of over 1,000 academic journals. It is alleged that Swartz avoided MIT’s and JSTOR’s security efforts in order to distribute a significant proportion of JSTOR’s archive through one or more file-sharing sites.

The indictment alleges that Swartz’s repeated automatic downloads impaired JSTOR’s computers, brought down some of its servers, and deprived various computers at MIT from accessing JSTOR’s research. Even after JSTOR and MIT worked to block Swartz’s computers, Swartz allegedly returned with new methods for accessing JSTOR and downloading articles.

The indictment alleges that Swartz exploited MIT’s computer system to steal over four million articles from JSTOR, even though Swartz was not affiliated with MIT as a student, faculty member, or employee. In fact, during these events, Swartz was allegedly a fellow at a Boston-area university, through which he could have accessed JSTOR’s services and archive for legitimate research.

"Feds Charge Activist As Hacker For Downloading Millions of Academic Articles"

by

Ryan Singel

July 19th, 2011

Wired

Well-known coder and activist Aaron Swartz was arrested Tuesday, charged with violating federal hacking laws for downloading millions of academic articles from a subscription database service that MIT had given him access to. If convicted, Swartz faces up to 35 years in prison and a $1 million fine.

Swartz, the 24-year-old executive director of Demand Progress, has a history of downloading massive data sets, both to use in research and to release public domain documents from behind paywalls. Swartz, who was aware of the investigation, turned himself in Tuesday.

Disclosure: Swartz is a co-founder of Reddit, which like Wired.com is owned by Condé Nast. He is also a general friend of Wired.com, and has done coding work for Wired.

The grand jury indictment accuses Swartz of evading MIT’s attempts to kick his laptop off the network while downloading more than four million documents from JSTOR, a not-for-profit company that provides searchable, digitized copies of academic journals. The scraping, which took place from September 2010 to January 2011 via MIT’s network, was invasive enough to bring down JSTOR’s servers on several occasions.

According to the U.S. attorney’s office, Swartz was arraigned in U.S. District Court in Boston this morning where he pled not guilty to all counts. He is now free on a $100,000 unsecured bond. His next court date is Sept. 9, 2011, and he’s represented by Andrew Good of Good and Cormier.

The indictment alleges that Swartz, at the time a fellow at Harvard University, intended to distribute the documents on peer-to-peer networks. That did not happen, however, and all the documents have been returned to JSTOR.

JSTOR, the alleged victim in the case, did not refer the case to the feds, according to Heidi McGregor, the company’s vice president of Marketing & Communications, who said the company got the documents, a mixture of both copyrighted and public domain works, back from Swartz and was content with that.

As for whether JSTOR supports the prosecution, McGregor simply said that the company was not commenting on the matter. She noted, however, that JSTOR has a program for academics who want to do big research on the corpus, but usually faculty members ask permission or contact the company after being booted off the network for too much downloading.

“This makes no sense,” said Demand Progress Executive Director David Segal in a statement provided by Swartz to Wired.com before the arrest. “It’s like trying to put someone in jail for allegedly checking too many books out of the library.”

“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal said.

JSTOR doesn’t go quite as far in its statement on the prosecution — though there are clear hints that they were not the ones who wanted a prosecution, and that they were subpoenaed to testify at the grand jury hearing by the federal government.

“We stopped this downloading activity, and the individual responsible, Mr. Swartz, was identified. We secured from Mr. Swartz the content that was taken, and received confirmation that the content was not and would not be used, copied, transferred, or distributed.

“The criminal investigation and today’s indictment of Mr. Swartz has been directed by the United States Attorney’s Office.”

When asked about this, Christina Sterling, a spokeswoman for the U.S. Attorney’s office, said, “I can’t speak specifically about this case, but fundamentally speaking, the U.S. Attorney’s Office makes its own independent decisions regarding prosecution based on the merits of a case.”

But the feds clearly think they have a substantial hacking case on their hands, even though Swartz used guest accounts to access the network and is not accused of finding a security hole to slip through or using stolen credentials, as hacking is typically defined.

In essence, Swartz is accused of felony hacking for violating MIT and JSTOR’s terms of service. That legal theory has had mixed success — a federal court judge dismissed that argument in the Lori Drew cyberbullying case, but it was later reused with more success in a case brought against ticket scalpers who used automated means to buy tickets faster from Ticketmaster’s computer system.

“Stealing is stealing whether you use a computer command or a crowbar, and whether you take documents, data or dollars. It is equally harmful to the victim whether you sell what you have stolen or give it away,” said United States Attorney Carmen M. Ortiz in a press release.

The indictment accuses Swartz of repeatedly spoofing the MAC address — an identifier that is usually static — of his computer after MIT blocked his computer based on that number. Swartz also allegedly snuck an Acer laptop bought just for the downloading into a closet at MIT in order to get a persistent connection to the network.

Swartz allegedly hid his face from surveillance cameras by holding his bike helmet up to his face and looking through the ventilation holes when going in to swap out an external drive used to store the documents. Swartz also allegedly named his guest account “Gary Host,” with the nickname “Ghost.”

Why would Swartz want to download what is likely gigabytes of information? His history includes a study co-authored with Shireen Barday, which combed through thousands of law review articles looking for law professors who had been paid by industry patrons to write papers. That study was published in 2008 in the Stanford Law Review.

Swartz is no stranger to the feds being interested in his skills at prodigious downloads. In 2008, the federal court system decided to try out allowing free public access to its court record search system PACER at 17 libraries across the country. Swartz went to the 7th U.S. Circuit Court of Appeals library in Chicago and installed a small PERL script he had written. The code cycled sequentially through case numbers, requesting a new document from PACER every three seconds. In this manner, Swartz got nearly 20 million pages of court documents, which his script uploaded to Amazon’s EC2 cloud computing service.
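For readers curious what "cycling sequentially through case numbers" at a fixed interval looks like, here is a minimal sketch in Python rather than Perl. It is not Swartz's script, which is not reproduced in the article. The function name `crawl_sequential`, the `fetch` callable, and the case-number range are all hypothetical, and no real PACER endpoint or API is used; the caller supplies whatever retrieval function applies.

```python
import time
from typing import Callable, Iterator, Optional, Tuple


def crawl_sequential(
    fetch: Callable[[int], Optional[bytes]],   # hypothetical: returns a document, or None if absent
    first_case: int,
    last_case: int,
    delay_seconds: float = 3.0,                # the article mentions one request every three seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (case_number, document) for each case number in order,
    pausing a fixed interval between consecutive requests."""
    for case_number in range(first_case, last_case + 1):
        document = fetch(case_number)
        if document is not None:
            yield case_number, document
        # Wait between requests, but not after the final one.
        if case_number != last_case:
            sleep(delay_seconds)
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting; passing a dictionary's `get` method as `fetch` is enough to try it out offline.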

While the documents are in the public record and free to share, PACER normally charges eight cents a page.

The courts reported him to the FBI, which investigated whether the public records were “exfiltrated.” After in-depth background searches, a luckless stakeout and futile attempts to get Swartz to talk, the FBI dropped the case.

The same anti-hacking statute was used to prosecute Lori Drew, who was charged criminally for participating in a MySpace cyberbullying scheme against a 13-year-old Missouri girl who later committed suicide. The case against Drew hinged on the government’s novel argument that violating MySpace’s terms of service was the legal equivalent of computer hacking and a violation of the Computer Fraud and Abuse Act.

A federal judge who presided over the prosecution tossed the guilty verdicts in July 2009, and the government declined to appeal.
