A news classification tool developed for Improve the News, a project by Max Tegmark

Welcome to MIT News Classify's documentation!

MIT News Classify is a package containing several natural language processing (NLP) and machine-learning models that have been fine-tuned on the NYT Annotated Corpus and its predefined news tags. It was developed in summer 2020 with the Tegmark Group at MIT as a tool for NLP-based topic classification. Some of its models have been integrated into the Tegmark Group's other projects, such as Improve The News.

TF-IDF Model

The TF-IDF model classifies an article into topics mainly by looking for the appearance of keywords. The keywords were compiled by running Naive Bayes on the New York Times Annotated Corpus to get the top unique words pertaining to each label.

Once a new article is fed into the TF-IDF model, it is transformed by TF-IDF (Term Frequency - Inverse Document Frequency) as if it were a new article in the New York Times Annotated Corpus. The output is then run through a neural network with two layers (sizes 2500 and 500, respectively) to produce the predicted tags for the article.
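The TF-IDF step described above can be illustrated with a minimal, standard-library sketch. It is not the package's implementation: the tokenizer, the smoothed IDF formula, the L2 normalisation, and the toy corpus and vocabulary are all assumptions chosen for illustration. The point it shows is that the IDF weights are fitted once on a reference corpus and then reused to transform a new article over a fixed keyword vocabulary.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def fit_idf(corpus, vocabulary):
    """Compute smoothed inverse document frequencies over a fixed vocabulary."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        present = set(tokenize(doc))
        for word in vocabulary:
            if word in present:
                doc_freq[word] += 1
    # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
    return {w: math.log((1 + n_docs) / (1 + doc_freq[w])) + 1 for w in vocabulary}


def transform(article, vocabulary, idf):
    """Map an article to an L2-normalised TF-IDF vector using the corpus IDF."""
    counts = Counter(tokenize(article))
    vec = [counts[w] * idf[w] for w in vocabulary]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


# Hypothetical toy data, not taken from the NYT Annotated Corpus.
vocabulary = ["hurricane", "police", "virus"]
corpus = [
    "The hurricane hit the coast and the hurricane moved inland",
    "Police responded to the protest",
    "The virus spread and police closed the area",
]
idf = fit_idf(corpus, vocabulary)
vector = transform("A new hurricane is forming while the virus spreads", vocabulary, idf)
```

In this toy run the new article mentions "hurricane" and "virus" once each and never mentions "police", so the vector has two equal non-zero entries and a zero in the middle.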

N MB of model files will be downloaded, M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.tfidf.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

>>> from mitnewsclassify import tfidf

>>> tfidf.gettags("Republicans proceeded with the third night of their national convention, but many Americans — particularly those in the path of Hurricane Laura — were focused on more immediate concerns.")
['hurricanes and tropical storms']

>>> tfidf.gettags("The Milwaukee Bucks chose to boycott their Wednesday matchup against the Orlando Magic in protest of the police shooting of Jacob Blake, a 29-year-old Black man, in Wisconsin.")
['boycotts']

>>> tfidf.gettags("One superspreading event may be connected to about 20,000 Covid-19 cases in the Boston area, a researcher said on Tuesday. That event, a biotech conference attended by 200 people in late February, is now well known as a source of Covid-19 spread very early on in the pandemic. Here is how a virus spreads Here is how a virus spreads 01:45 'Ultimately, more than 90 cases were diagnosed in people associated with this conference or their contacts, raising suspicion that a superspreading event had occurred there,' the researchers wrote in their study. Superspreading occurs when one or a few infected people cause a cascade of transmissions of an infectious disease. The new study -- which has not yet been peer-reviewed but was posted to the online server medrxiv.org on Tuesday -- involved analyzing the impact of early superspreading events in the Boston area and provided 'direct evidence' that superspreading can profoundly alter the course of an epidemic. 'An unfortunate perfect storm' The researchers -- from the Broad Institute of MIT and Harvard in Cambridge and other various institutions -- conducted genetic analyses of coronavirus specimen samples in Massachusetts. The researchers sequenced and analyzed 772 complete genomes of the virus from the region. They found 80 introductions of the virus into the Boston, predominantly from elsewhere in the United States and Europe, and 'hundreds of cases from major outbreaks' in various settings, including the conference. Coronavirus quickly spread around the world starting late last year, new genetic analysis shows Coronavirus quickly spread around the world starting late last year, new genetic analysis shows The conference, held from February 26 to 27, was a 'perfect storm' and the superspreading there could have been connected to approximately 20,000 cases, Bronwyn MacInnis, a researcher at the Broad Institute who worked on the study, told CNN in an email on Tuesday. 
'Many factors made the conference an unfortunate perfect storm as a superspreading event. That the virus was introduced at the conference at all was unlucky,' MacInnis wrote in the email. 'This is not a rigorous estimate but does communicate the scale,' MacInnis added. 'If tens of thousands of individuals seems large, it is important to point out that it is in context of a pandemic that has infected tens of millions of people.' Unseen Covid-19 cases began early, spread fast Unseen Covid-19 cases began early, spread fast 03:00 Timing was crucial. In late February, people were not yet aware of the pandemic risk. 'When it happened was critical: it was scheduled just as we were collectively beginning to appreciate the imminent threat of COVID at home--if it had been a week later the event likely would have been cancelled,' MacInnis wrote in the email. 'Also, because it happened early in the epidemic it had the chance to spread widely before extensive testing capacity, shutdowns, social distancing, and masking were in place,' she wrote. 'The other critical factor was the population the virus landed in: people who had come from many different places (including some where COVID was already circulating), and who then returned home, often unknowingly bringing the virus with them.' 'A much greater understanding of how easily and quickly this virus can be transmitted' While the researchers did not identify the conference in their study, The Boston Globe on Tuesday said it was an international meeting of leaders from the biotechnology company Biogen at the Marriott Long Wharf hotel in Boston. How 53 members of this choir were infected in 'super spreader' event How 53 members of this choir were infected in 'super spreader' event 03:03 'February 2020 was nearly a half year ago, and was a period when general knowledge about the coronavirus was limited,' Biogen said in a written statement to CNN on Tuesday. 'We were adhering closely to the prevailing official guidelines. 
We never would have knowingly put anyone at risk. When we learned a number of our colleagues were ill, we did not know the cause was COVID-19, but we immediately notified public health authorities and took steps to limit the spread.' The company noted in its statement that it joined a collaboration with the Broad Institute in April to share biological and medical data to advance knowledge around Covid-19. 'The world today has a much greater understanding of how easily and quickly this virus can be transmitted, and we are proud to contribute through this collaboration to the global effort to overcome COVID-19,' it said. Who or what is a super spreader? Dr. Sanjay Gupta's coronavirus podcast for June 18 explains. Who or what is a super spreader? Dr. Sanjay Gupta's coronavirus podcast for June 18 explains. Massachusetts Governor Charlie Baker said in a news conference on Tuesday that he saw the Biogen conference in February as a 'seminal event' in the coronavirus pandemic for the Boston area. 'I was criticized actually for saying a few months ago that the Biogen event was a seminal event with respect to corona here in the Commonwealth and I couldn't put a number on it at that point in time,' Baker said. 'This is no offense to anybody, but at that point in time, nobody was wearing masks, nobody was social distancing, nobody was even behaving with concern about the presence of the virus at all. I mean all rules of the game with respect to that have changed,' Baker said. 'It speaks to the power of that virus to move from one person to another to another.' Get CNN Health's weekly newsletter Sign up here to get The Results Are In with Dr. Sanjay Gupta every Tuesday from the CNN Health team. 
The new pre-print study also investigated the spread of the coronavirus in other settings across the Boston area, including a skilled nursing facility -- where 85% of residents and 37% of staff tested positive -- and a homeless shelter -- where the coronavirus was introduced seven times, including four that resulted in clusters of cases, according to the study. 'Our findings repeatedly highlight the close relationships between seemingly disconnected groups and populations: viruses from international business travel seeded major outbreaks among individuals experiencing homelessness, spread throughout the Boston area, and were exported to other domestic and international sites,' the researchers wrote in the study.")
['medicine and health', 'diseases and conditions', 'acquired immune deficiency syndrome', 'viruses']

>>> tfidf.gettags("Two people died and one person was injured as shots were fired late Tuesday in Kenosha during the third night of unrest in Wisconsin following the shooting of a Black man by police, Kenosha police said. The shooting was reported at about 11:45 p.m. in an area where protests have taken place, Kenosha police Lt. Joseph Nosalik said in a news release. Kenosha County Sheriff David Beth said one victim had been shot in the head and another in the chest late Tuesday, just before midnight, according to the Milwaukee Journal Sentinel. Beth didn’t know where the other person was shot, but his or her injuries are not believed to be life threatening. The shooting was under investigation and no other information was released. The victims have not been identified. Jacob Blake, who was shot shot multiple times by police in Wisconsin, is paralyzed, and it would “take a miracle” for him to walk again, his family’s attorney said Tuesday, while calling for the officer who opened fire to be arrested and others involved to lose their jobs. The shooting of Blake on Sunday in Kenosha — apparently in the back while three of his children looked on — was captured on cellphone video and ignited new protests over racial injustice in several cities, coming just three months after the death of George Floyd at the hands of Minneapolis police touched off a wider reckoning on race. Earlier Tuesday, Blake’s father spoke alongside other family members and lawyers, telling reporters that police shot his son “seven times, seven times, like he didn’t matter.” “But my son matters. He’s a human being and he matters,” said Blake’s father, who is also named Jacob Blake. The 29-year-old was in surgery Tuesday, said attorney Ben Crump, adding that the bullets severed Blake’s spinal cord and shattered his vertebrae. Another attorney said there was also severe damage to organs. “It’s going to take a miracle for Jacob Blake Jr. to ever walk again,” Crump said. 
The legal team plans to file a civil lawsuit against the police department over the shooting. Police have said little about what happened, other than that they were responding to a domestic dispute. The officers involved have not been named. The Wisconsin Department of Justice is investigating. Police fired tear gas for a third night Tuesday to disperse protesters who had gathered outside Kenosha’s courthouse, where some shook a protective fence and threw water bottles and fireworks at officers lined up behind it. Police then used armored vehicles and officers with shields pushed back the crowd when protesters ignored warnings to leave a nearby park. Wisconsin Gov. Tony Evers had called for calm Tuesday, while also declaring a state of emergency under which he doubled the National Guard deployment in Kenosha from 125 to 250. The night before crowds destroyed dozens of buildings and set more than 30 fires in the city’s downtown. “We cannot allow the cycle of systemic racism and injustice to continue,” said Evers, who is facing mounting pressure from Republicans over his handling of the unrest. “We also cannot continue going down this path of damage and destruction.” Blake’s mother, Julia Jackson, said the damage in Kenosha does not reflect what her family wants and that, if her son could see it, he would be “very unpleased.” She said the first thing her son said to her when she saw him was he was sorry. “He said, ‘I don’t want to be a burden on you guys,’” Jackson said. “’I want to be with my children, and I don’t think I’ll walk again.’” Three of the younger Blake’s sons — aged 3, 5 and 8 — were in the car at the time of the shooting, Crump said. It was the 8-year-old’s birthday, he added. The man who said he made the cellphone video of the shooting, 22-year-old Raysean White, said he saw Blake scuffling with three officers and heard them yell, “Drop the knife! Drop the knife!” before the gunfire erupted. He said he didn’t see a knife in Blake’s hands. 
In the footage, Blake walks from the sidewalk around the front of his SUV to his driver-side door as officers follow him with their guns drawn and shout at him. As Blake opens the door and leans into the SUV, an officer grabs his shirt from behind and opens fire. Seven shots can be heard, though it isn’t clear how many struck Blake or how many officers fired. Blake’s father told the Chicago Sun-Times that his son had eight holes in his body. Anger over the shooting has spilled into the streets of Kenosha and other cities, including Los Angeles, Wisconsin’s capital of Madison and in Minneapolis, the epicenter of the Black Lives Matter movement this summer following Floyd’s death. Hundreds of people again defied curfew Tuesday in Kenosha, where destruction marred protests the previous night as fires were set and businesses vandalized. There were 34 fires associated with that unrest, with 30 businesses destroyed or damaged along with an unknown number of residences, Kenosha Fire Chief Charles Leipzig told the Kenosha News. “Nobody deserves this,” said Pat Oertle, owner of Computer Adventure, surveying the damage on Tuesday. Computers were stolen, and the store was “destroyed,” she said. “This accomplishes nothing,” Oertle said. “This is not justice that they’re looking for.” U.S. Sen. Ron Johnson and U.S. Rep. Bryan Steil, both Republicans, called on the governor to do more to quell the unrest. Steil said he would request federal assistance if necessary. Evers continued to call for protesters to be peaceful. “Please do not allow the actions of a few distract us from the work we must do together to demand justice, equity, and accountability,” he said. Blake’s family also called for calm. “I really ask you and encourage everyone in Wisconsin and abroad to take a moment and examine your hearts,” Blake’s mother said. “Do Jacob justice on this level and examine your hearts. 
… As I pray for my son’s healing physically, emotionally and spiritually, I also have been praying even before this for the healing of our country.”")
['politics and government', 'demonstrations and riots', 'police']

getfeatures

class mitnewsclassify.tfidf.getfeatures(txt)

Gets the values of the second layer in the neural network for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • features (Tensor[float32]) - The 500 values

Example

>>> from mitnewsclassify import tfidf

>>> tfidf.getfeatures("A destructive storm is rising from warm waters. Again. America and the world are getting more frequent and bigger multibillion dollar tropical catastrophes like Hurricane Laura, which is menacing the U.S. Gulf Coast, because of a combination of increased coastal development, natural climate cycles, reductions in air pollution and man-made climate change, experts say. The list of recent whoppers keeps growing: Harvey, Irma, Maria, Florence, Michael, Dorian. And hurricane experts have no doubt that Laura will be right there with them. It’s a mess at least partially of our own making, said Susan Cutter, director of the Hazards and Vulnerability Institute at the University of South Carolina. “We are seeing an increase of intensity of these phenomena because we as a society are fundamentally changing the Earth and at the same time we are moving to locations that are more hazardous,” Cutter said Wednesday. In the last three years, the United States has had seven hurricane disasters that each caused at least $1 billion in damage, totaling $335 billion. In all of the 1980s, there were six, and their damage totaled $38.2 billion, according to the National Oceanic and Atmospheric Administration. All those figures are adjusted for the cost of living. The Atlantic is increasingly spawning more major hurricanes, according to an Associated Press analysis of NOAA hurricane data since 1950. That designation refers to storms with at least 111-mile-per-hour (179-kilometer-per-hour) winds that are the ones that do the most damage. The Atlantic now averages three major hurricanes a year, based on a 30-year running average. In the 1980s and 1990s, it was two. The Atlantic’s Accumulated Cyclone Energy — a measurement that takes into account the number of storms, their strength and how long they last — is now 120 on a 30-year running average. Thirty years ago, it was in the 70s or 80s on average. 
Some people argue the increase is due to unchecked coastal development, while others will point to man-made climate change from the burning of coal, oil and gas. In fact, both are responsible, said former Federal Emergency Management Agency chief Craig Fugate. “There’s a lot of factors going on,” he said. When it comes to hurricane risk, a major factor is “the amount of stuff in the way of natural peril and the vulnerability of the stuff in the way,” said Mark Bove, a meteorologist who works for the insurance firm Munich Re U.S. One factor that increases the possibility that there will be “stuff in the way” of a major storm is that federal disaster policy and flood insurance subsidize and encourage people to rebuild in risky areas, Fugate said. After storms, communities “always say they are going to rise from the ashes,” and, too often, they build the same way in the same place for the same vulnerability and the same outcome, Fugate said. In addition, some places, like Houston, don’t limit development in areas that could serve as flood control zones if left empty and allow development that’s not disaster resilient, said Kathleen Tierney, former director of the Natural Hazards Center at Colorado University. Now add in the meteorology. Scientists agree that waters are warming, and that serves as hurricane fuel, said NOAA climate scientist Jim Kossin. A study by Kossin found that, once a storm formed, the chances of its attaining major storm status globally increased by 8% a decade since 1979. In the Atlantic, chances went up by 49% a decade. But scientists disagree on why waters are warming. They know climate change is a factor — but they say it’s not the biggest driver and disagree on what else may be behind it. 
Some argue it’s because of a 25- to 30-year natural global cycle that acts like a giant conveyor belt, carrying different levels of salt and temperature around the globe, including into the part of the tropical Atlantic off Africa where the worst hurricanes form, Colorado State University hurricane researcher Phil Klotzbach said. When the water in the northern Atlantic is extra warm, the water in those tropical hurricane breeding grounds is unusually hot, and the hurricane season is abnormally active, Klotzbach said. Such a busy period started in 1995 and might end soon as northern Atlantic waters shift to a cooler regime, he said. Klotzbach acknowledged that one problem with this theory is that the waters in the northern Atlantic have been unusually cool this summer, and still there have been lots of storms. It may have been a blip, he said. But MIT meteorology professor Kerry Emanuel says it’s because another counterintuitive factor is at play: There are more storms because of cleaner air. European air pollution cooled the area over Africa in the 1960s and 1970s and put more dust into the air — both of which tamped down on any hurricanes, he said. When the pollution eased, Africa got warmer, more storms developed, and that’s why it’s such a busy period, Emanuel said. While climate change is not the most important factor in warming waters, it contributes to creating more damaging storms in other ways, by causing a rising sea level that worsens storm surges and making storms move more slowly and produce more rain, scientists say. All of this means that we should get used to more catastrophic storms, according to Munich Re’s Bove. In addition, he said: “Climate change will be a bigger driver of losses in the future.”")
<tf.Tensor: shape=(1, 500), dtype=float32, numpy=
array([[0.        , 2.035066  , 0.        , 1.1831102 , 1.8716689 ,
        ...
        2.4259803 , 0.30450952, 0.        , 0.        , 0.5605872 ]],
      dtype=float32)>
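One possible use of these 500 values is as an article embedding for comparing two articles. The sketch below is a hedged illustration, not part of the package: `cosine_similarity` is a hypothetical helper, and it assumes the returned tensor has first been flattened to a plain list, for example with `features.numpy()[0].tolist()`. The short vectors stand in for real 500-value outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Stand-in vectors (assumption): in practice each would hold the 500 values
# obtained from tfidf.getfeatures(text).numpy()[0].tolist().
storm_article = [0.0, 2.0, 0.0, 1.0]
weather_article = [0.0, 4.0, 0.0, 2.0]
sports_article = [3.0, 0.0, 1.0, 0.0]

similar = cosine_similarity(storm_article, weather_article)
different = cosine_similarity(storm_article, sports_article)
```

Because the values shown in the example output are non-negative, similarities computed this way would fall between 0 and 1, with higher values suggesting that the model treats the two articles as closer in topic.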

TF-IDF Bigrams Model

The TF-IDF Bigrams model classifies an article into topics mainly by looking for the appearance of keywords. In this case, a keyword is a phrase of two adjacent words, for instance "nuclear weapons." The keywords were compiled by running Naive Bayes on the New York Times Annotated Corpus to get the top unique phrases pertaining to each label.
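To make the notion of a two-word keyword concrete, here is a minimal sketch of bigram extraction and keyword counting. The tokenization rule and the keyword list are assumptions for illustration; the package's actual preprocessing and keyword phrases are not shown in this documentation.

```python
from collections import Counter


def bigrams(text):
    """Return adjacent word pairs from text, lowercased, with punctuation removed."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    tokens = cleaned.split()
    return [f"{first} {second}" for first, second in zip(tokens, tokens[1:])]


def keyword_counts(text, keywords):
    """Count how often each keyword phrase occurs as a bigram in text."""
    counts = Counter(bigrams(text))
    return {phrase: counts[phrase] for phrase in keywords}


# Hypothetical keyword phrases, chosen only for this example.
keywords = ["nuclear weapons", "health care"]
sentence = "Talks on nuclear weapons stalled, and nuclear weapons inspectors left."
phrase_counts = keyword_counts(sentence, keywords)
```

Counts like these would play the role of term frequencies in the TF-IDF transformation, with phrases taking the place of single words.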

Once a new article is fed into the TF-IDF Bigrams model, it is transformed by TF-IDF (Term Frequency - Inverse Document Frequency) as if it were a new article in the New York Times Annotated Corpus. The output is then run through a neural network with two layers (sizes 2000 and 500, respectively) to produce the predicted tags for the article.
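The network stage can be sketched as a plain forward pass. Everything specific here is an assumption made for illustration: the ReLU and sigmoid activations, the separate output layer, the 0.5 threshold, the label names, and the tiny hand-picked weights (the real layers have 2000 and 500 units and trained weights). The sketch only shows how a TF-IDF vector could flow through two hidden layers to a list of tags.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def predict_tags(tfidf_vector, hidden_layers, output_layer, labels, threshold=0.5):
    """Run the hidden layers, score each label, and keep labels above the threshold."""
    hidden = tfidf_vector
    for weights, biases in hidden_layers:
        hidden = relu(dense(hidden, weights, biases))
    scores = sigmoid(dense(hidden, *output_layer))
    return [label for label, score in zip(labels, scores) if score >= threshold]


# Hypothetical toy network: 2 inputs -> 3 units -> 2 units -> 2 labels.
hidden_layers = [
    ([[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]], [0.0, 0.0]),
]
output_layer = ([[1.0, -1.0], [0.0, 0.0]], [-1.0, -1.0])
labels = ["weather", "police"]

tags = predict_tags([1.0, 0.0], hidden_layers, output_layer, labels)
no_tags = predict_tags([0.0, 0.0], hidden_layers, output_layer, labels)
```

Under these assumptions, an article whose TF-IDF vector contains no keywords produces scores below the threshold for every label, so the result is an empty list.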

N MB of model files will be downloaded, M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.tfidf_bi.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

>>> from mitnewsclassify import tfidf_bi

>>> tfidf_bi.gettags("Republicans proceeded with the third night of their national convention, but many Americans — particularly those in the path of Hurricane Laura — were focused on more immediate concerns.")
['conventions and conferences']

>>> tfidf_bi.gettags("The Milwaukee Bucks chose to boycott their Wednesday matchup against the Orlando Magic in protest of the police shooting of Jacob Blake, a 29-year-old Black man, in Wisconsin.")
[]

>>> tfidf_bi.gettags("One superspreading event may be connected to about 20,000 Covid-19 cases in the Boston area, a researcher said on Tuesday. That event, a biotech conference attended by 200 people in late February, is now well known as a source of Covid-19 spread very early on in the pandemic. Here is how a virus spreads Here is how a virus spreads 01:45 'Ultimately, more than 90 cases were diagnosed in people associated with this conference or their contacts, raising suspicion that a superspreading event had occurred there,' the researchers wrote in their study. Superspreading occurs when one or a few infected people cause a cascade of transmissions of an infectious disease. The new study -- which has not yet been peer-reviewed but was posted to the online server medrxiv.org on Tuesday -- involved analyzing the impact of early superspreading events in the Boston area and provided 'direct evidence' that superspreading can profoundly alter the course of an epidemic. 'An unfortunate perfect storm' The researchers -- from the Broad Institute of MIT and Harvard in Cambridge and other various institutions -- conducted genetic analyses of coronavirus specimen samples in Massachusetts. The researchers sequenced and analyzed 772 complete genomes of the virus from the region. They found 80 introductions of the virus into the Boston, predominantly from elsewhere in the United States and Europe, and 'hundreds of cases from major outbreaks' in various settings, including the conference. Coronavirus quickly spread around the world starting late last year, new genetic analysis shows Coronavirus quickly spread around the world starting late last year, new genetic analysis shows The conference, held from February 26 to 27, was a 'perfect storm' and the superspreading there could have been connected to approximately 20,000 cases, Bronwyn MacInnis, a researcher at the Broad Institute who worked on the study, told CNN in an email on Tuesday. 
'Many factors made the conference an unfortunate perfect storm as a superspreading event. That the virus was introduced at the conference at all was unlucky,' MacInnis wrote in the email. 'This is not a rigorous estimate but does communicate the scale,' MacInnis added. 'If tens of thousands of individuals seems large, it is important to point out that it is in context of a pandemic that has infected tens of millions of people.' Unseen Covid-19 cases began early, spread fast Unseen Covid-19 cases began early, spread fast 03:00 Timing was crucial. In late February, people were not yet aware of the pandemic risk. 'When it happened was critical: it was scheduled just as we were collectively beginning to appreciate the imminent threat of COVID at home--if it had been a week later the event likely would have been cancelled,' MacInnis wrote in the email. 'Also, because it happened early in the epidemic it had the chance to spread widely before extensive testing capacity, shutdowns, social distancing, and masking were in place,' she wrote. 'The other critical factor was the population the virus landed in: people who had come from many different places (including some where COVID was already circulating), and who then returned home, often unknowingly bringing the virus with them.' 'A much greater understanding of how easily and quickly this virus can be transmitted' While the researchers did not identify the conference in their study, The Boston Globe on Tuesday said it was an international meeting of leaders from the biotechnology company Biogen at the Marriott Long Wharf hotel in Boston. How 53 members of this choir were infected in &#39;super spreader&#39; event How 53 members of this choir were infected in 'super spreader' event 03:03 'February 2020 was nearly a half year ago, and was a period when general knowledge about the coronavirus was limited,' Biogen said in a written statement to CNN on Tuesday. 'We were adhering closely to the prevailing official guidelines. 
We never would have knowingly put anyone at risk. When we learned a number of our colleagues were ill, we did not know the cause was COVID-19, but we immediately notified public health authorities and took steps to limit the spread.' The company noted in its statement that it joined a collaboration with the Broad Institute in April to share biological and medical data to advance knowledge around Covid-19. 'The world today has a much greater understanding of how easily and quickly this virus can be transmitted, and we are proud to contribute through this collaboration to the global effort to overcome COVID-19,' it said. Who or what is a super spreader? Dr. Sanjay Gupta&#39;s coronavirus podcast for June 18 explains. Who or what is a super spreader? Dr. Sanjay Gupta's coronavirus podcast for June 18 explains. Massachusetts Governor Charlie Baker said in a news conference on Tuesday that he saw the Biogen conference in February as a 'seminal event' in the coronavirus pandemic for the Boston area. 'I was criticized actually for saying a few months ago that the Biogen event was a seminal event with respect to corona here in the Commonwealth and I couldn't put a number on it at that point in time,' Baker said. 'This is no offense to anybody, but at that point in time, nobody was wearing masks, nobody was social distancing, nobody was even behaving with concern about the presence of the virus at all. I mean all rules of the game with respect to that have changed,' Baker said. 'It speaks to the power of that virus to move from one person to another to another.' Get CNN Health's weekly newsletter Sign up here to get The Results Are In with Dr. Sanjay Gupta every Tuesday from the CNN Health team. 
The new pre-print study also investigated the spread of the coronavirus in other settings across the Boston area, including a skilled nursing facility -- where 85% of residents and 37% of staff tested positive -- and a homeless shelter -- where the coronavirus was introduced seven times, including four that resulted in clusters of cases, according to the study. 'Our findings repeatedly highlight the close relationships between seemingly disconnected groups and populations: viruses from international business travel seeded major outbreaks among individuals experiencing homelessness, spread throughout the Boston area, and were exported to other domestic and international sites,' the researchers wrote in the study.")
['medicine and health', 'diseases and conditions']

>>> tfidf_bi.gettags("Two people died and one person was injured as shots were fired late Tuesday in Kenosha during the third night of unrest in Wisconsin following the shooting of a Black man by police, Kenosha police said. The shooting was reported at about 11:45 p.m. in an area where protests have taken place, Kenosha police Lt. Joseph Nosalik said in a news release. Kenosha County Sheriff David Beth said one victim had been shot in the head and another in the chest late Tuesday, just before midnight, according to the Milwaukee Journal Sentinel. Beth didn’t know where the other person was shot, but his or her injuries are not believed to be life threatening. The shooting was under investigation and no other information was released. The victims have not been identified. Jacob Blake, who was shot shot multiple times by police in Wisconsin, is paralyzed, and it would “take a miracle” for him to walk again, his family’s attorney said Tuesday, while calling for the officer who opened fire to be arrested and others involved to lose their jobs. The shooting of Blake on Sunday in Kenosha — apparently in the back while three of his children looked on — was captured on cellphone video and ignited new protests over racial injustice in several cities, coming just three months after the death of George Floyd at the hands of Minneapolis police touched off a wider reckoning on race. Earlier Tuesday, Blake’s father spoke alongside other family members and lawyers, telling reporters that police shot his son “seven times, seven times, like he didn’t matter.” “But my son matters. He’s a human being and he matters,” said Blake’s father, who is also named Jacob Blake. The 29-year-old was in surgery Tuesday, said attorney Ben Crump, adding that the bullets severed Blake’s spinal cord and shattered his vertebrae. Another attorney said there was also severe damage to organs. “It’s going to take a miracle for Jacob Blake Jr. to ever walk again,” Crump said. 
The legal team plans to file a civil lawsuit against the police department over the shooting. Police have said little about what happened, other than that they were responding to a domestic dispute. The officers involved have not been named. The Wisconsin Department of Justice is investigating. Police fired tear gas for a third night Tuesday to disperse protesters who had gathered outside Kenosha’s courthouse, where some shook a protective fence and threw water bottles and fireworks at officers lined up behind it. Police then used armored vehicles and officers with shields pushed back the crowd when protesters ignored warnings to leave a nearby park. Wisconsin Gov. Tony Evers had called for calm Tuesday, while also declaring a state of emergency under which he doubled the National Guard deployment in Kenosha from 125 to 250. The night before crowds destroyed dozens of buildings and set more than 30 fires in the city’s downtown. “We cannot allow the cycle of systemic racism and injustice to continue,” said Evers, who is facing mounting pressure from Republicans over his handling of the unrest. “We also cannot continue going down this path of damage and destruction.” Blake’s mother, Julia Jackson, said the damage in Kenosha does not reflect what her family wants and that, if her son could see it, he would be “very unpleased.” She said the first thing her son said to her when she saw him was he was sorry. “He said, ‘I don’t want to be a burden on you guys,’” Jackson said. “’I want to be with my children, and I don’t think I’ll walk again.’” Three of the younger Blake’s sons — aged 3, 5 and 8 — were in the car at the time of the shooting, Crump said. It was the 8-year-old’s birthday, he added. The man who said he made the cellphone video of the shooting, 22-year-old Raysean White, said he saw Blake scuffling with three officers and heard them yell, “Drop the knife! Drop the knife!” before the gunfire erupted. He said he didn’t see a knife in Blake’s hands. 
In the footage, Blake walks from the sidewalk around the front of his SUV to his driver-side door as officers follow him with their guns drawn and shout at him. As Blake opens the door and leans into the SUV, an officer grabs his shirt from behind and opens fire. Seven shots can be heard, though it isn’t clear how many struck Blake or how many officers fired. Blake’s father told the Chicago Sun-Times that his son had eight holes in his body. Anger over the shooting has spilled into the streets of Kenosha and other cities, including Los Angeles, Wisconsin’s capital of Madison and in Minneapolis, the epicenter of the Black Lives Matter movement this summer following Floyd’s death. Hundreds of people again defied curfew Tuesday in Kenosha, where destruction marred protests the previous night as fires were set and businesses vandalized. There were 34 fires associated with that unrest, with 30 businesses destroyed or damaged along with an unknown number of residences, Kenosha Fire Chief Charles Leipzig told the Kenosha News. “Nobody deserves this,” said Pat Oertle, owner of Computer Adventure, surveying the damage on Tuesday. Computers were stolen, and the store was “destroyed,” she said. “This accomplishes nothing,” Oertle said. “This is not justice that they’re looking for.” U.S. Sen. Ron Johnson and U.S. Rep. Bryan Steil, both Republicans, called on the governor to do more to quell the unrest. Steil said he would request federal assistance if necessary. Evers continued to call for protesters to be peaceful. “Please do not allow the actions of a few distract us from the work we must do together to demand justice, equity, and accountability,” he said. Blake’s family also called for calm. “I really ask you and encourage everyone in Wisconsin and abroad to take a moment and examine your hearts,” Blake’s mother said. “Do Jacob justice on this level and examine your hearts. 
… As I pray for my son’s healing physically, emotionally and spiritually, I also have been praying even before this for the healing of our country.”")
['demonstrations and riots']

getfeatures

class mitnewsclassify.tfidf_bi.getfeatures(txt)

Gets the values of the second layer in the neural network for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • features (Tensor[float32]) - The 500 second-layer values, as a tensor of shape (1, 500)

Example

>>> from mitnewsclassify import tfidf_bi

>>> tfidf_bi.getfeatures("A destructive storm is rising from warm waters. Again. America and the world are getting more frequent and bigger multibillion dollar tropical catastrophes like Hurricane Laura, which is menacing the U.S. Gulf Coast, because of a combination of increased coastal development, natural climate cycles, reductions in air pollution and man-made climate change, experts say. The list of recent whoppers keeps growing: Harvey, Irma, Maria, Florence, Michael, Dorian. And hurricane experts have no doubt that Laura will be right there with them. It’s a mess at least partially of our own making, said Susan Cutter, director of the Hazards and Vulnerability Institute at the University of South Carolina. “We are seeing an increase of intensity of these phenomena because we as a society are fundamentally changing the Earth and at the same time we are moving to locations that are more hazardous,” Cutter said Wednesday. In the last three years, the United States has had seven hurricane disasters that each caused at least $1 billion in damage, totaling $335 billion. In all of the 1980s, there were six, and their damage totaled $38.2 billion, according to the National Oceanic and Atmospheric Administration. All those figures are adjusted for the cost of living. The Atlantic is increasingly spawning more major hurricanes, according to an Associated Press analysis of NOAA hurricane data since 1950. That designation refers to storms with at least 111-mile-per-hour (179-kilometer-per-hour) winds that are the ones that do the most damage. The Atlantic now averages three major hurricanes a year, based on a 30-year running average. In the 1980s and 1990s, it was two. The Atlantic’s Accumulated Cyclone Energy — a measurement that takes into account the number of storms, their strength and how long they last — is now 120 on a 30-year running average. Thirty years ago, it was in the 70s or 80s on average. 
Some people argue the increase is due to unchecked coastal development, while others will point to man-made climate change from the burning of coal, oil and gas. In fact, both are responsible, said former Federal Emergency Management Agency chief Craig Fugate. “There’s a lot of factors going on,” he said. When it comes to hurricane risk, a major factor is “the amount of stuff in the way of natural peril and the vulnerability of the stuff in the way,” said Mark Bove, a meteorologist who works for the insurance firm Munich Re U.S. One factor that increases the possibility that there will be “stuff in the way” of a major storm is that federal disaster policy and flood insurance subsidize and encourage people to rebuild in risky areas, Fugate said. After storms, communities “always say they are going to rise from the ashes,” and, too often, they build the same way in the same place for the same vulnerability and the same outcome, Fugate said. In addition, some places, like Houston, don’t limit development in areas that could serve as flood control zones if left empty and allow development that’s not disaster resilient, said Kathleen Tierney, former director of the Natural Hazards Center at Colorado University. Now add in the meteorology. Scientists agree that waters are warming, and that serves as hurricane fuel, said NOAA climate scientist Jim Kossin. A study by Kossin found that, once a storm formed, the chances of its attaining major storm status globally increased by 8% a decade since 1979. In the Atlantic, chances went up by 49% a decade. But scientists disagree on why waters are warming. They know climate change is a factor — but they say it’s not the biggest driver and disagree on what else may be behind it. 
Some argue it’s because of a 25- to 30-year natural global cycle that acts like a giant conveyor belt, carrying different levels of salt and temperature around the globe, including into the part of the tropical Atlantic off Africa where the worst hurricanes form, Colorado State University hurricane researcher Phil Klotzbach said. When the water in the northern Atlantic is extra warm, the water in those tropical hurricane breeding grounds is unusually hot, and the hurricane season is abnormally active, Klotzbach said. Such a busy period started in 1995 and might end soon as northern Atlantic waters shift to a cooler regime, he said. Klotzbach acknowledged that one problem with this theory is that the waters in the northern Atlantic have been unusually cool this summer, and still there have been lots of storms. It may have been a blip, he said. But MIT meteorology professor Kerry Emanuel says it’s because another counterintuitive factor is at play: There are more storms because of cleaner air. European air pollution cooled the area over Africa in the 1960s and 1970s and put more dust into the air — both of which tamped down on any hurricanes, he said. When the pollution eased, Africa got warmer, more storms developed, and that’s why it’s such a busy period, Emanuel said. While climate change is not the most important factor in warming waters, it contributes to creating more damaging storms in other ways, by causing a rising sea level that worsens storm surges and making storms move more slowly and produce more rain, scientists say. All of this means that we should get used to more catastrophic storms, according to Munich Re’s Bove. In addition, he said: “Climate change will be a bigger driver of losses in the future.”")
<tf.Tensor: shape=(1, 500), dtype=float32, numpy=
array([[1.4679118e+00, 7.8771174e-01, 6.4926636e-01, 5.0401473e-01,
        ...
        1.0278748e+00, 9.7658831e-01, 9.7091949e-01, 1.3398045e-01]],
      dtype=float32)>
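
The feature vectors can be used outside the classifier, for instance to compare two articles. The sketch below is a minimal illustration and not part of the package: `cosine_similarity` is a hypothetical helper, and the short lists stand in for flattened 500-value rows, which could be obtained with something like `tfidf_bi.getfeatures(text).numpy()[0]`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

# Toy stand-ins for flattened feature rows (real rows would hold 500 values).
features_a = [1.0, 0.5, 0.0, 2.0]
features_b = [2.0, 1.0, 0.0, 4.0]  # same direction as features_a
features_c = [0.0, 0.0, 3.0, 0.0]  # orthogonal to features_a

sim_ab = cosine_similarity(features_a, features_b)
sim_ac = cosine_similarity(features_a, features_c)
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0. Whether this similarity is meaningful for real articles would need to be checked empirically.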

Doc2Vec Model

The Doc2Vec model uses gensim's implementation of “Distributed Representations of Sentences and Documents” by Quoc Le and Tomas Mikolov, which embeds an entire document into a vector based on the patterns of words that occur next to one another. The model is trained using 5 neighbors, 10 epochs and vectors of size 2000, and is saved on our server.
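
For readers who want to experiment with a comparable setup, the sketch below maps the hyperparameters above onto the keyword arguments of gensim's `Doc2Vec`. Reading "5 neighbors" as `window=5` is an assumption, as are `min_count=1` and the hypothetical `embed` helper. The actual training script is not shown in this documentation.

```python
# Keyword names follow gensim's Doc2Vec constructor. Reading "5 neighbors"
# as window=5 is an assumption, not something stated by the package.
DOC2VEC_PARAMS = {
    "vector_size": 2000,  # "vectors of size 2000"
    "window": 5,          # assumed meaning of "5 neighbors"
    "epochs": 10,         # "10 epochs"
}

def embed(tokens, corpus):
    """Train on a (toy) tokenized corpus and infer a vector for `tokens`.

    Requires gensim; it is imported lazily so the parameters above can be
    inspected without it.
    """
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    tagged = [TaggedDocument(words=doc, tags=[i]) for i, doc in enumerate(corpus)]
    # min_count=1 keeps every word, which only makes sense for a tiny corpus.
    model = Doc2Vec(documents=tagged, min_count=1, **DOC2VEC_PARAMS)
    return model.infer_vector(tokens)
```

Calling `embed(["storm", "warning"], corpus)` with a list of token lists as `corpus` would return a 2000-dimensional vector. The package itself downloads a pre-trained model rather than training one locally.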

Once a new article is fed into the Doc2Vec model, the article gets embedded through the saved model as if it were a new article in the New York Times Annotated Corpus. The output is then run through two neural network layers (of size 1200 and 800, respectively) to produce the predicted tags for the article.
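
As a rough illustration of this pipeline, the sketch below passes an embedding through two hidden layers and an output layer, then keeps every tag whose score clears a threshold. The ReLU and sigmoid activations, the 0.5 threshold, the random weights, the label list and all helper names are assumptions made for illustration. Layer sizes are scaled down from 2000, 1200 and 800 to keep the example fast. It is not the package's actual network.

```python
import math
import random

def dense(x, weights, biases):
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def random_layer(rng, n_in, n_out):
    """Random weights standing in for trained ones (illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_tags(embedding, layers, labels, threshold=0.5):
    """Multi-label prediction: every tag is scored independently."""
    hidden1, hidden2, output = layers
    h1 = relu(dense(embedding, *hidden1))
    h2 = relu(dense(h1, *hidden2))        # the layer getfeatures is described as returning
    scores = sigmoid(dense(h2, *output))  # one score in (0, 1) per tag
    return [label for label, s in zip(labels, scores) if s >= threshold]

LABELS = ["hurricanes and tropical storms", "demonstrations and riots", "medicine and health"]

rng = random.Random(0)
# Scaled-down stand-ins for the 2000 -> 1200 -> 800 sizes described above.
layers = [
    random_layer(rng, 20, 12),
    random_layer(rng, 12, 8),
    random_layer(rng, 8, len(LABELS)),
]
embedding = [rng.uniform(-1.0, 1.0) for _ in range(20)]
tags = predict_tags(embedding, layers, LABELS)
```

Under this reading, an empty result such as `[]` would mean that no tag cleared the threshold. That is one possible explanation, not a documented behavior.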

N GB of model files will be downloaded, M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.doc2vec.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

>>> from mitnewsclassify import doc2vec

>>> doc2vec.gettags("Republicans proceeded with the third night of their national convention, but many Americans — particularly those in the path of Hurricane Laura — were focused on more immediate concerns.")
[]

>>> doc2vec.gettags("The Milwaukee Bucks chose to boycott their Wednesday matchup against the Orlando Magic in protest of the police shooting of Jacob Blake, a 29-year-old Black man, in Wisconsin.")
['demonstrations and riots']

>>> doc2vec.gettags("One superspreading event may be connected to about 20,000 Covid-19 cases in the Boston area, a researcher said on Tuesday. That event, a biotech conference attended by 200 people in late February, is now well known as a source of Covid-19 spread very early on in the pandemic. Here is how a virus spreads Here is how a virus spreads 01:45 'Ultimately, more than 90 cases were diagnosed in people associated with this conference or their contacts, raising suspicion that a superspreading event had occurred there,' the researchers wrote in their study. Superspreading occurs when one or a few infected people cause a cascade of transmissions of an infectious disease. The new study -- which has not yet been peer-reviewed but was posted to the online server medrxiv.org on Tuesday -- involved analyzing the impact of early superspreading events in the Boston area and provided 'direct evidence' that superspreading can profoundly alter the course of an epidemic. 'An unfortunate perfect storm' The researchers -- from the Broad Institute of MIT and Harvard in Cambridge and other various institutions -- conducted genetic analyses of coronavirus specimen samples in Massachusetts. The researchers sequenced and analyzed 772 complete genomes of the virus from the region. They found 80 introductions of the virus into the Boston, predominantly from elsewhere in the United States and Europe, and 'hundreds of cases from major outbreaks' in various settings, including the conference. Coronavirus quickly spread around the world starting late last year, new genetic analysis shows Coronavirus quickly spread around the world starting late last year, new genetic analysis shows The conference, held from February 26 to 27, was a 'perfect storm' and the superspreading there could have been connected to approximately 20,000 cases, Bronwyn MacInnis, a researcher at the Broad Institute who worked on the study, told CNN in an email on Tuesday. 
'Many factors made the conference an unfortunate perfect storm as a superspreading event. That the virus was introduced at the conference at all was unlucky,' MacInnis wrote in the email. 'This is not a rigorous estimate but does communicate the scale,' MacInnis added. 'If tens of thousands of individuals seems large, it is important to point out that it is in context of a pandemic that has infected tens of millions of people.' Unseen Covid-19 cases began early, spread fast Unseen Covid-19 cases began early, spread fast 03:00 Timing was crucial. In late February, people were not yet aware of the pandemic risk. 'When it happened was critical: it was scheduled just as we were collectively beginning to appreciate the imminent threat of COVID at home--if it had been a week later the event likely would have been cancelled,' MacInnis wrote in the email. 'Also, because it happened early in the epidemic it had the chance to spread widely before extensive testing capacity, shutdowns, social distancing, and masking were in place,' she wrote. 'The other critical factor was the population the virus landed in: people who had come from many different places (including some where COVID was already circulating), and who then returned home, often unknowingly bringing the virus with them.' 'A much greater understanding of how easily and quickly this virus can be transmitted' While the researchers did not identify the conference in their study, The Boston Globe on Tuesday said it was an international meeting of leaders from the biotechnology company Biogen at the Marriott Long Wharf hotel in Boston. How 53 members of this choir were infected in &#39;super spreader&#39; event How 53 members of this choir were infected in 'super spreader' event 03:03 'February 2020 was nearly a half year ago, and was a period when general knowledge about the coronavirus was limited,' Biogen said in a written statement to CNN on Tuesday. 'We were adhering closely to the prevailing official guidelines. 
We never would have knowingly put anyone at risk. When we learned a number of our colleagues were ill, we did not know the cause was COVID-19, but we immediately notified public health authorities and took steps to limit the spread.' The company noted in its statement that it joined a collaboration with the Broad Institute in April to share biological and medical data to advance knowledge around Covid-19. 'The world today has a much greater understanding of how easily and quickly this virus can be transmitted, and we are proud to contribute through this collaboration to the global effort to overcome COVID-19,' it said. Who or what is a super spreader? Dr. Sanjay Gupta&#39;s coronavirus podcast for June 18 explains. Who or what is a super spreader? Dr. Sanjay Gupta's coronavirus podcast for June 18 explains. Massachusetts Governor Charlie Baker said in a news conference on Tuesday that he saw the Biogen conference in February as a 'seminal event' in the coronavirus pandemic for the Boston area. 'I was criticized actually for saying a few months ago that the Biogen event was a seminal event with respect to corona here in the Commonwealth and I couldn't put a number on it at that point in time,' Baker said. 'This is no offense to anybody, but at that point in time, nobody was wearing masks, nobody was social distancing, nobody was even behaving with concern about the presence of the virus at all. I mean all rules of the game with respect to that have changed,' Baker said. 'It speaks to the power of that virus to move from one person to another to another.' Get CNN Health's weekly newsletter Sign up here to get The Results Are In with Dr. Sanjay Gupta every Tuesday from the CNN Health team. 
The new pre-print study also investigated the spread of the coronavirus in other settings across the Boston area, including a skilled nursing facility -- where 85% of residents and 37% of staff tested positive -- and a homeless shelter -- where the coronavirus was introduced seven times, including four that resulted in clusters of cases, according to the study. 'Our findings repeatedly highlight the close relationships between seemingly disconnected groups and populations: viruses from international business travel seeded major outbreaks among individuals experiencing homelessness, spread throughout the Boston area, and were exported to other domestic and international sites,' the researchers wrote in the study.")
['medicine and health', 'diseases and conditions', 'acquired immune deficiency syndrome', 'viruses']

>>> doc2vec.gettags("Two people died and one person was injured as shots were fired late Tuesday in Kenosha during the third night of unrest in Wisconsin following the shooting of a Black man by police, Kenosha police said. The shooting was reported at about 11:45 p.m. in an area where protests have taken place, Kenosha police Lt. Joseph Nosalik said in a news release. Kenosha County Sheriff David Beth said one victim had been shot in the head and another in the chest late Tuesday, just before midnight, according to the Milwaukee Journal Sentinel. Beth didn’t know where the other person was shot, but his or her injuries are not believed to be life threatening. The shooting was under investigation and no other information was released. The victims have not been identified. Jacob Blake, who was shot shot multiple times by police in Wisconsin, is paralyzed, and it would “take a miracle” for him to walk again, his family’s attorney said Tuesday, while calling for the officer who opened fire to be arrested and others involved to lose their jobs. The shooting of Blake on Sunday in Kenosha — apparently in the back while three of his children looked on — was captured on cellphone video and ignited new protests over racial injustice in several cities, coming just three months after the death of George Floyd at the hands of Minneapolis police touched off a wider reckoning on race. Earlier Tuesday, Blake’s father spoke alongside other family members and lawyers, telling reporters that police shot his son “seven times, seven times, like he didn’t matter.” “But my son matters. He’s a human being and he matters,” said Blake’s father, who is also named Jacob Blake. The 29-year-old was in surgery Tuesday, said attorney Ben Crump, adding that the bullets severed Blake’s spinal cord and shattered his vertebrae. Another attorney said there was also severe damage to organs. “It’s going to take a miracle for Jacob Blake Jr. to ever walk again,” Crump said. 
The legal team plans to file a civil lawsuit against the police department over the shooting. Police have said little about what happened, other than that they were responding to a domestic dispute. The officers involved have not been named. The Wisconsin Department of Justice is investigating. Police fired tear gas for a third night Tuesday to disperse protesters who had gathered outside Kenosha’s courthouse, where some shook a protective fence and threw water bottles and fireworks at officers lined up behind it. Police then used armored vehicles and officers with shields pushed back the crowd when protesters ignored warnings to leave a nearby park. Wisconsin Gov. Tony Evers had called for calm Tuesday, while also declaring a state of emergency under which he doubled the National Guard deployment in Kenosha from 125 to 250. The night before crowds destroyed dozens of buildings and set more than 30 fires in the city’s downtown. “We cannot allow the cycle of systemic racism and injustice to continue,” said Evers, who is facing mounting pressure from Republicans over his handling of the unrest. “We also cannot continue going down this path of damage and destruction.” Blake’s mother, Julia Jackson, said the damage in Kenosha does not reflect what her family wants and that, if her son could see it, he would be “very unpleased.” She said the first thing her son said to her when she saw him was he was sorry. “He said, ‘I don’t want to be a burden on you guys,’” Jackson said. “’I want to be with my children, and I don’t think I’ll walk again.’” Three of the younger Blake’s sons — aged 3, 5 and 8 — were in the car at the time of the shooting, Crump said. It was the 8-year-old’s birthday, he added. The man who said he made the cellphone video of the shooting, 22-year-old Raysean White, said he saw Blake scuffling with three officers and heard them yell, “Drop the knife! Drop the knife!” before the gunfire erupted. He said he didn’t see a knife in Blake’s hands. 
In the footage, Blake walks from the sidewalk around the front of his SUV to his driver-side door as officers follow him with their guns drawn and shout at him. As Blake opens the door and leans into the SUV, an officer grabs his shirt from behind and opens fire. Seven shots can be heard, though it isn’t clear how many struck Blake or how many officers fired. Blake’s father told the Chicago Sun-Times that his son had eight holes in his body. Anger over the shooting has spilled into the streets of Kenosha and other cities, including Los Angeles, Wisconsin’s capital of Madison and in Minneapolis, the epicenter of the Black Lives Matter movement this summer following Floyd’s death. Hundreds of people again defied curfew Tuesday in Kenosha, where destruction marred protests the previous night as fires were set and businesses vandalized. There were 34 fires associated with that unrest, with 30 businesses destroyed or damaged along with an unknown number of residences, Kenosha Fire Chief Charles Leipzig told the Kenosha News. “Nobody deserves this,” said Pat Oertle, owner of Computer Adventure, surveying the damage on Tuesday. Computers were stolen, and the store was “destroyed,” she said. “This accomplishes nothing,” Oertle said. “This is not justice that they’re looking for.” U.S. Sen. Ron Johnson and U.S. Rep. Bryan Steil, both Republicans, called on the governor to do more to quell the unrest. Steil said he would request federal assistance if necessary. Evers continued to call for protesters to be peaceful. “Please do not allow the actions of a few distract us from the work we must do together to demand justice, equity, and accountability,” he said. Blake’s family also called for calm. “I really ask you and encourage everyone in Wisconsin and abroad to take a moment and examine your hearts,” Blake’s mother said. “Do Jacob justice on this level and examine your hearts. 
… As I pray for my son’s healing physically, emotionally and spiritually, I also have been praying even before this for the healing of our country.”")
['politics and government', 'elections', 'presidents and presidency (us)', 'presidential elections (us)', 'demonstrations and riots', 'police', 'blacks', 'police brutality and misconduct']
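
Different models can return quite different tag sets for the same article, as the outputs for the Kenosha article show. One simple way to quantify the agreement is the Jaccard overlap of the two tag lists. The `tag_overlap` helper below is hypothetical and not part of the package. The two lists are copied from the `tfidf_bi` and `doc2vec` outputs above.

```python
def tag_overlap(tags_a, tags_b):
    """Jaccard overlap: shared tags divided by all distinct tags."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty predictions agree
    return len(a & b) / len(a | b)

# Outputs shown above for the same article.
tfidf_bi_tags = ['demonstrations and riots']
doc2vec_tags = [
    'politics and government', 'elections', 'presidents and presidency (us)',
    'presidential elections (us)', 'demonstrations and riots', 'police',
    'blacks', 'police brutality and misconduct',
]

overlap = tag_overlap(tfidf_bi_tags, doc2vec_tags)  # 1 shared tag out of 8 distinct
```

Here the overlap is 0.125: the single tag predicted by `tfidf_bi` is also among the eight predicted by `doc2vec`.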

getfeatures

class mitnewsclassify.doc2vec.getfeatures(txt)

Gets the values of the second layer in the neural network for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • features (Tensor[float32]) - The 800 second-layer values, as a tensor of shape (1, 800)

Example

>>> from mitnewsclassify import doc2vec

>>> doc2vec.getfeatures("A destructive storm is rising from warm waters. Again. America and the world are getting more frequent and bigger multibillion dollar tropical catastrophes like Hurricane Laura, which is menacing the U.S. Gulf Coast, because of a combination of increased coastal development, natural climate cycles, reductions in air pollution and man-made climate change, experts say. The list of recent whoppers keeps growing: Harvey, Irma, Maria, Florence, Michael, Dorian. And hurricane experts have no doubt that Laura will be right there with them. It’s a mess at least partially of our own making, said Susan Cutter, director of the Hazards and Vulnerability Institute at the University of South Carolina. “We are seeing an increase of intensity of these phenomena because we as a society are fundamentally changing the Earth and at the same time we are moving to locations that are more hazardous,” Cutter said Wednesday. In the last three years, the United States has had seven hurricane disasters that each caused at least $1 billion in damage, totaling $335 billion. In all of the 1980s, there were six, and their damage totaled $38.2 billion, according to the National Oceanic and Atmospheric Administration. All those figures are adjusted for the cost of living. The Atlantic is increasingly spawning more major hurricanes, according to an Associated Press analysis of NOAA hurricane data since 1950. That designation refers to storms with at least 111-mile-per-hour (179-kilometer-per-hour) winds that are the ones that do the most damage. The Atlantic now averages three major hurricanes a year, based on a 30-year running average. In the 1980s and 1990s, it was two. The Atlantic’s Accumulated Cyclone Energy — a measurement that takes into account the number of storms, their strength and how long they last — is now 120 on a 30-year running average. Thirty years ago, it was in the 70s or 80s on average. 
Some people argue the increase is due to unchecked coastal development, while others will point to man-made climate change from the burning of coal, oil and gas. In fact, both are responsible, said former Federal Emergency Management Agency chief Craig Fugate. “There’s a lot of factors going on,” he said. When it comes to hurricane risk, a major factor is “the amount of stuff in the way of natural peril and the vulnerability of the stuff in the way,” said Mark Bove, a meteorologist who works for the insurance firm Munich Re U.S. One factor that increases the possibility that there will be “stuff in the way” of a major storm is that federal disaster policy and flood insurance subsidize and encourage people to rebuild in risky areas, Fugate said. After storms, communities “always say they are going to rise from the ashes,” and, too often, they build the same way in the same place for the same vulnerability and the same outcome, Fugate said. In addition, some places, like Houston, don’t limit development in areas that could serve as flood control zones if left empty and allow development that’s not disaster resilient, said Kathleen Tierney, former director of the Natural Hazards Center at Colorado University. Now add in the meteorology. Scientists agree that waters are warming, and that serves as hurricane fuel, said NOAA climate scientist Jim Kossin. A study by Kossin found that, once a storm formed, the chances of its attaining major storm status globally increased by 8% a decade since 1979. In the Atlantic, chances went up by 49% a decade. But scientists disagree on why waters are warming. They know climate change is a factor — but they say it’s not the biggest driver and disagree on what else may be behind it. 
Some argue it’s because of a 25- to 30-year natural global cycle that acts like a giant conveyor belt, carrying different levels of salt and temperature around the globe, including into the part of the tropical Atlantic off Africa where the worst hurricanes form, Colorado State University hurricane researcher Phil Klotzbach said. When the water in the northern Atlantic is extra warm, the water in those tropical hurricane breeding grounds is unusually hot, and the hurricane season is abnormally active, Klotzbach said. Such a busy period started in 1995 and might end soon as northern Atlantic waters shift to a cooler regime, he said. Klotzbach acknowledged that one problem with this theory is that the waters in the northern Atlantic have been unusually cool this summer, and still there have been lots of storms. It may have been a blip, he said. But MIT meteorology professor Kerry Emanuel says it’s because another counterintuitive factor is at play: There are more storms because of cleaner air. European air pollution cooled the area over Africa in the 1960s and 1970s and put more dust into the air — both of which tamped down on any hurricanes, he said. When the pollution eased, Africa got warmer, more storms developed, and that’s why it’s such a busy period, Emanuel said. While climate change is not the most important factor in warming waters, it contributes to creating more damaging storms in other ways, by causing a rising sea level that worsens storm surges and making storms move more slowly and produce more rain, scientists say. All of this means that we should get used to more catastrophic storms, according to Munich Re’s Bove. In addition, he said: “Climate change will be a bigger driver of losses in the future.”")
<tf.Tensor: shape=(1, 800), dtype=float32, numpy=
array([[ 0.67777956,  1.9491149 ,  0.3388513 ,  2.0435412 ,  4.2271314 ,
         ...
         1.8406105 ,  2.0213082 ,  0.        ,  0.29909152,  0.        ]],
      dtype=float32)>

GPT2 Model

The GPT2 model uses Hugging Face's implementation of GPT-2, the large transformer-based language model described in "Language Models are Unsupervised Multitask Learners" by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. Here it embeds an entire document into a vector; such embeddings are commonly used for a variety of natural language processing tasks. The model is trained using [HYPERPARAMETERS], and is saved on our server.

Once a new article is fed into the GPT2 model, the article gets embedded through the saved model. The output is then run through a deep neural network to produce the predicted tags for the article.

N GB of model files will be downloaded, M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.gpt2.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

>>> from mitnewsclassify import gpt2

>>> gpt2.gettags("Republicans proceeded with the third night of their national convention, but many Americans — particularly those in the path of Hurricane Laura — were focused on more immediate concerns.")
['politics and government', 'motion pictures', 'budgets and budgeting', 'medicine and health', 'presidents and presidency (us)', 'immigration and refugees']

>>> gpt2.gettags("The Milwaukee Bucks chose to boycott their Wednesday matchup against the Orlando Magic in protest of the police shooting of Jacob Blake, a 29-year-old Black man, in Wisconsin.")
['recordings (video)', 'hurricanes and tropical storms']

>>> gpt2.gettags("One superspreading event may be connected to about 20,000 Covid-19 cases in the Boston area, a researcher said on Tuesday. That event, a biotech conference attended by 200 people in late February, is now well known as a source of Covid-19 spread very early on in the pandemic. Here is how a virus spreads Here is how a virus spreads 01:45 'Ultimately, more than 90 cases were diagnosed in people associated with this conference or their contacts, raising suspicion that a superspreading event had occurred there,' the researchers wrote in their study. Superspreading occurs when one or a few infected people cause a cascade of transmissions of an infectious disease. The new study -- which has not yet been peer-reviewed but was posted to the online server medrxiv.org on Tuesday -- involved analyzing the impact of early superspreading events in the Boston area and provided 'direct evidence' that superspreading can profoundly alter the course of an epidemic. 'An unfortunate perfect storm' The researchers -- from the Broad Institute of MIT and Harvard in Cambridge and other various institutions -- conducted genetic analyses of coronavirus specimen samples in Massachusetts. The researchers sequenced and analyzed 772 complete genomes of the virus from the region. They found 80 introductions of the virus into the Boston, predominantly from elsewhere in the United States and Europe, and 'hundreds of cases from major outbreaks' in various settings, including the conference. Coronavirus quickly spread around the world starting late last year, new genetic analysis shows Coronavirus quickly spread around the world starting late last year, new genetic analysis shows The conference, held from February 26 to 27, was a 'perfect storm' and the superspreading there could have been connected to approximately 20,000 cases, Bronwyn MacInnis, a researcher at the Broad Institute who worked on the study, told CNN in an email on Tuesday. 
'Many factors made the conference an unfortunate perfect storm as a superspreading event. That the virus was introduced at the conference at all was unlucky,' MacInnis wrote in the email. 'This is not a rigorous estimate but does communicate the scale,' MacInnis added. 'If tens of thousands of individuals seems large, it is important to point out that it is in context of a pandemic that has infected tens of millions of people.' Unseen Covid-19 cases began early, spread fast Unseen Covid-19 cases began early, spread fast 03:00 Timing was crucial. In late February, people were not yet aware of the pandemic risk. 'When it happened was critical: it was scheduled just as we were collectively beginning to appreciate the imminent threat of COVID at home--if it had been a week later the event likely would have been cancelled,' MacInnis wrote in the email. 'Also, because it happened early in the epidemic it had the chance to spread widely before extensive testing capacity, shutdowns, social distancing, and masking were in place,' she wrote. 'The other critical factor was the population the virus landed in: people who had come from many different places (including some where COVID was already circulating), and who then returned home, often unknowingly bringing the virus with them.' 'A much greater understanding of how easily and quickly this virus can be transmitted' While the researchers did not identify the conference in their study, The Boston Globe on Tuesday said it was an international meeting of leaders from the biotechnology company Biogen at the Marriott Long Wharf hotel in Boston. How 53 members of this choir were infected in &#39;super spreader&#39; event How 53 members of this choir were infected in 'super spreader' event 03:03 'February 2020 was nearly a half year ago, and was a period when general knowledge about the coronavirus was limited,' Biogen said in a written statement to CNN on Tuesday. 'We were adhering closely to the prevailing official guidelines. 
We never would have knowingly put anyone at risk. When we learned a number of our colleagues were ill, we did not know the cause was COVID-19, but we immediately notified public health authorities and took steps to limit the spread.' The company noted in its statement that it joined a collaboration with the Broad Institute in April to share biological and medical data to advance knowledge around Covid-19. 'The world today has a much greater understanding of how easily and quickly this virus can be transmitted, and we are proud to contribute through this collaboration to the global effort to overcome COVID-19,' it said. Who or what is a super spreader? Dr. Sanjay Gupta&#39;s coronavirus podcast for June 18 explains. Who or what is a super spreader? Dr. Sanjay Gupta's coronavirus podcast for June 18 explains. Massachusetts Governor Charlie Baker said in a news conference on Tuesday that he saw the Biogen conference in February as a 'seminal event' in the coronavirus pandemic for the Boston area. 'I was criticized actually for saying a few months ago that the Biogen event was a seminal event with respect to corona here in the Commonwealth and I couldn't put a number on it at that point in time,' Baker said. 'This is no offense to anybody, but at that point in time, nobody was wearing masks, nobody was social distancing, nobody was even behaving with concern about the presence of the virus at all. I mean all rules of the game with respect to that have changed,' Baker said. 'It speaks to the power of that virus to move from one person to another to another.' Get CNN Health's weekly newsletter Sign up here to get The Results Are In with Dr. Sanjay Gupta every Tuesday from the CNN Health team. 
The new pre-print study also investigated the spread of the coronavirus in other settings across the Boston area, including a skilled nursing facility -- where 85% of residents and 37% of staff tested positive -- and a homeless shelter -- where the coronavirus was introduced seven times, including four that resulted in clusters of cases, according to the study. 'Our findings repeatedly highlight the close relationships between seemingly disconnected groups and populations: viruses from international business travel seeded major outbreaks among individuals experiencing homelessness, spread throughout the Boston area, and were exported to other domestic and international sites,' the researchers wrote in the study.")
['elections', 'firearms']

>>> gpt2.gettags("Two people died and one person was injured as shots were fired late Tuesday in Kenosha during the third night of unrest in Wisconsin following the shooting of a Black man by police, Kenosha police said. The shooting was reported at about 11:45 p.m. in an area where protests have taken place, Kenosha police Lt. Joseph Nosalik said in a news release. Kenosha County Sheriff David Beth said one victim had been shot in the head and another in the chest late Tuesday, just before midnight, according to the Milwaukee Journal Sentinel. Beth didn’t know where the other person was shot, but his or her injuries are not believed to be life threatening. The shooting was under investigation and no other information was released. The victims have not been identified. Jacob Blake, who was shot shot multiple times by police in Wisconsin, is paralyzed, and it would “take a miracle” for him to walk again, his family’s attorney said Tuesday, while calling for the officer who opened fire to be arrested and others involved to lose their jobs. The shooting of Blake on Sunday in Kenosha — apparently in the back while three of his children looked on — was captured on cellphone video and ignited new protests over racial injustice in several cities, coming just three months after the death of George Floyd at the hands of Minneapolis police touched off a wider reckoning on race. Earlier Tuesday, Blake’s father spoke alongside other family members and lawyers, telling reporters that police shot his son “seven times, seven times, like he didn’t matter.” “But my son matters. He’s a human being and he matters,” said Blake’s father, who is also named Jacob Blake. The 29-year-old was in surgery Tuesday, said attorney Ben Crump, adding that the bullets severed Blake’s spinal cord and shattered his vertebrae. Another attorney said there was also severe damage to organs. “It’s going to take a miracle for Jacob Blake Jr. to ever walk again,” Crump said. 
The legal team plans to file a civil lawsuit against the police department over the shooting. Police have said little about what happened, other than that they were responding to a domestic dispute. The officers involved have not been named. The Wisconsin Department of Justice is investigating. Police fired tear gas for a third night Tuesday to disperse protesters who had gathered outside Kenosha’s courthouse, where some shook a protective fence and threw water bottles and fireworks at officers lined up behind it. Police then used armored vehicles and officers with shields pushed back the crowd when protesters ignored warnings to leave a nearby park. Wisconsin Gov. Tony Evers had called for calm Tuesday, while also declaring a state of emergency under which he doubled the National Guard deployment in Kenosha from 125 to 250. The night before crowds destroyed dozens of buildings and set more than 30 fires in the city’s downtown. “We cannot allow the cycle of systemic racism and injustice to continue,” said Evers, who is facing mounting pressure from Republicans over his handling of the unrest. “We also cannot continue going down this path of damage and destruction.” Blake’s mother, Julia Jackson, said the damage in Kenosha does not reflect what her family wants and that, if her son could see it, he would be “very unpleased.” She said the first thing her son said to her when she saw him was he was sorry. “He said, ‘I don’t want to be a burden on you guys,’” Jackson said. “’I want to be with my children, and I don’t think I’ll walk again.’” Three of the younger Blake’s sons — aged 3, 5 and 8 — were in the car at the time of the shooting, Crump said. It was the 8-year-old’s birthday, he added. The man who said he made the cellphone video of the shooting, 22-year-old Raysean White, said he saw Blake scuffling with three officers and heard them yell, “Drop the knife! Drop the knife!” before the gunfire erupted. He said he didn’t see a knife in Blake’s hands. 
In the footage, Blake walks from the sidewalk around the front of his SUV to his driver-side door as officers follow him with their guns drawn and shout at him. As Blake opens the door and leans into the SUV, an officer grabs his shirt from behind and opens fire. Seven shots can be heard, though it isn’t clear how many struck Blake or how many officers fired. Blake’s father told the Chicago Sun-Times that his son had eight holes in his body. Anger over the shooting has spilled into the streets of Kenosha and other cities, including Los Angeles, Wisconsin’s capital of Madison and in Minneapolis, the epicenter of the Black Lives Matter movement this summer following Floyd’s death. Hundreds of people again defied curfew Tuesday in Kenosha, where destruction marred protests the previous night as fires were set and businesses vandalized. There were 34 fires associated with that unrest, with 30 businesses destroyed or damaged along with an unknown number of residences, Kenosha Fire Chief Charles Leipzig told the Kenosha News. “Nobody deserves this,” said Pat Oertle, owner of Computer Adventure, surveying the damage on Tuesday. Computers were stolen, and the store was “destroyed,” she said. “This accomplishes nothing,” Oertle said. “This is not justice that they’re looking for.” U.S. Sen. Ron Johnson and U.S. Rep. Bryan Steil, both Republicans, called on the governor to do more to quell the unrest. Steil said he would request federal assistance if necessary. Evers continued to call for protesters to be peaceful. “Please do not allow the actions of a few distract us from the work we must do together to demand justice, equity, and accountability,” he said. Blake’s family also called for calm. “I really ask you and encourage everyone in Wisconsin and abroad to take a moment and examine your hearts,” Blake’s mother said. “Do Jacob justice on this level and examine your hearts. 
… As I pray for my son’s healing physically, emotionally and spiritually, I also have been praying even before this for the healing of our country.”")
['homosexuality', 'computer software', 'decisions and verdicts', 'libraries and librarians', 'presidential election of 2004', 'computer and video games', 'serial murders']

getfeatures

class mitnewsclassify.gpt2.getfeatures(txt)

Gets the output of the saved GPT2 model for a given article text, that is, the embedding produced before it is fed into the neural network that performs the classification.

Parameters

  • txt (string) - The article text

Returns

  • features (Tensor[float32]) - The 768 values

Example

>>> from mitnewsclassify import gpt2

>>> gpt2.getfeatures("A destructive storm is rising from warm waters. Again. America and the world are getting more frequent and bigger multibillion dollar tropical catastrophes like Hurricane Laura, which is menacing the U.S. Gulf Coast, because of a combination of increased coastal development, natural climate cycles, reductions in air pollution and man-made climate change, experts say. The list of recent whoppers keeps growing: Harvey, Irma, Maria, Florence, Michael, Dorian. And hurricane experts have no doubt that Laura will be right there with them. It’s a mess at least partially of our own making, said Susan Cutter, director of the Hazards and Vulnerability Institute at the University of South Carolina. “We are seeing an increase of intensity of these phenomena because we as a society are fundamentally changing the Earth and at the same time we are moving to locations that are more hazardous,” Cutter said Wednesday. In the last three years, the United States has had seven hurricane disasters that each caused at least $1 billion in damage, totaling $335 billion. In all of the 1980s, there were six, and their damage totaled $38.2 billion, according to the National Oceanic and Atmospheric Administration. All those figures are adjusted for the cost of living. The Atlantic is increasingly spawning more major hurricanes, according to an Associated Press analysis of NOAA hurricane data since 1950. That designation refers to storms with at least 111-mile-per-hour (179-kilometer-per-hour) winds that are the ones that do the most damage. The Atlantic now averages three major hurricanes a year, based on a 30-year running average. In the 1980s and 1990s, it was two. The Atlantic’s Accumulated Cyclone Energy — a measurement that takes into account the number of storms, their strength and how long they last — is now 120 on a 30-year running average. Thirty years ago, it was in the 70s or 80s on average. 
Some people argue the increase is due to unchecked coastal development, while others will point to man-made climate change from the burning of coal, oil and gas. In fact, both are responsible, said former Federal Emergency Management Agency chief Craig Fugate. “There’s a lot of factors going on,” he said. When it comes to hurricane risk, a major factor is “the amount of stuff in the way of natural peril and the vulnerability of the stuff in the way,” said Mark Bove, a meteorologist who works for the insurance firm Munich Re U.S. One factor that increases the possibility that there will be “stuff in the way” of a major storm is that federal disaster policy and flood insurance subsidize and encourage people to rebuild in risky areas, Fugate said. After storms, communities “always say they are going to rise from the ashes,” and, too often, they build the same way in the same place for the same vulnerability and the same outcome, Fugate said. In addition, some places, like Houston, don’t limit development in areas that could serve as flood control zones if left empty and allow development that’s not disaster resilient, said Kathleen Tierney, former director of the Natural Hazards Center at Colorado University. Now add in the meteorology. Scientists agree that waters are warming, and that serves as hurricane fuel, said NOAA climate scientist Jim Kossin. A study by Kossin found that, once a storm formed, the chances of its attaining major storm status globally increased by 8% a decade since 1979. In the Atlantic, chances went up by 49% a decade. But scientists disagree on why waters are warming. They know climate change is a factor — but they say it’s not the biggest driver and disagree on what else may be behind it. 
Some argue it’s because of a 25- to 30-year natural global cycle that acts like a giant conveyor belt, carrying different levels of salt and temperature around the globe, including into the part of the tropical Atlantic off Africa where the worst hurricanes form, Colorado State University hurricane researcher Phil Klotzbach said. When the water in the northern Atlantic is extra warm, the water in those tropical hurricane breeding grounds is unusually hot, and the hurricane season is abnormally active, Klotzbach said. Such a busy period started in 1995 and might end soon as northern Atlantic waters shift to a cooler regime, he said. Klotzbach acknowledged that one problem with this theory is that the waters in the northern Atlantic have been unusually cool this summer, and still there have been lots of storms. It may have been a blip, he said. But MIT meteorology professor Kerry Emanuel says it’s because another counterintuitive factor is at play: There are more storms because of cleaner air. European air pollution cooled the area over Africa in the 1960s and 1970s and put more dust into the air — both of which tamped down on any hurricanes, he said. When the pollution eased, Africa got warmer, more storms developed, and that’s why it’s such a busy period, Emanuel said. While climate change is not the most important factor in warming waters, it contributes to creating more damaging storms in other ways, by causing a rising sea level that worsens storm surges and making storms move more slowly and produce more rain, scientists say. All of this means that we should get used to more catastrophic storms, according to Munich Re’s Bove. In addition, he said: “Climate change will be a bigger driver of losses in the future.”")
tensor([[ 1.2303e-01,  4.3685e-01,  2.6451e-01, -3.6422e-01, -4.5769e-01,
          ...
          6.5585e-01,  6.0992e-01, -1.5914e-01]])

DistilBERT Model

The DistilBERT model uses Hugging Face's implementation of DistilBERT (Victor Sanh, Lysandre Debut, Julien Chaumond and Thomas Wolf: “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter”), a transformer-based language model that embeds an entire document into a vector, which can then be used for various natural language processing tasks. The model is trained using [HYPERPARAMETERS], and is saved on our server.

Once a new article is fed into the DistilBERT model, the article gets embedded through the saved model. The output is then meant to be run through a deep neural network to produce the predicted tags for the article. Due to time constraints, the package does not currently support gettags for this model; only getfeatures is available.

N GB of model files will be downloaded, M MB of RAM will be used on average to run the model and the classification takes P ms.

getfeatures

class mitnewsclassify.distilbert.getfeatures(txt)

Gets the values of the second layer in the neural network for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • features (Tensor[float32]) - The 768 values

Example

>>> from mitnewsclassify import distilbert

>>> distilbert.getfeatures("A destructive storm is rising from warm waters. Again. America and the world are getting more frequent and bigger multibillion dollar tropical catastrophes like Hurricane Laura, which is menacing the U.S. Gulf Coast, because of a combination of increased coastal development, natural climate cycles, reductions in air pollution and man-made climate change, experts say. The list of recent whoppers keeps growing: Harvey, Irma, Maria, Florence, Michael, Dorian. And hurricane experts have no doubt that Laura will be right there with them. It’s a mess at least partially of our own making, said Susan Cutter, director of the Hazards and Vulnerability Institute at the University of South Carolina. “We are seeing an increase of intensity of these phenomena because we as a society are fundamentally changing the Earth and at the same time we are moving to locations that are more hazardous,” Cutter said Wednesday. In the last three years, the United States has had seven hurricane disasters that each caused at least $1 billion in damage, totaling $335 billion. In all of the 1980s, there were six, and their damage totaled $38.2 billion, according to the National Oceanic and Atmospheric Administration. All those figures are adjusted for the cost of living. The Atlantic is increasingly spawning more major hurricanes, according to an Associated Press analysis of NOAA hurricane data since 1950. That designation refers to storms with at least 111-mile-per-hour (179-kilometer-per-hour) winds that are the ones that do the most damage. The Atlantic now averages three major hurricanes a year, based on a 30-year running average. In the 1980s and 1990s, it was two. The Atlantic’s Accumulated Cyclone Energy — a measurement that takes into account the number of storms, their strength and how long they last — is now 120 on a 30-year running average. Thirty years ago, it was in the 70s or 80s on average. 
Some people argue the increase is due to unchecked coastal development, while others will point to man-made climate change from the burning of coal, oil and gas. In fact, both are responsible, said former Federal Emergency Management Agency chief Craig Fugate. “There’s a lot of factors going on,” he said. When it comes to hurricane risk, a major factor is “the amount of stuff in the way of natural peril and the vulnerability of the stuff in the way,” said Mark Bove, a meteorologist who works for the insurance firm Munich Re U.S. One factor that increases the possibility that there will be “stuff in the way” of a major storm is that federal disaster policy and flood insurance subsidize and encourage people to rebuild in risky areas, Fugate said. After storms, communities “always say they are going to rise from the ashes,” and, too often, they build the same way in the same place for the same vulnerability and the same outcome, Fugate said. In addition, some places, like Houston, don’t limit development in areas that could serve as flood control zones if left empty and allow development that’s not disaster resilient, said Kathleen Tierney, former director of the Natural Hazards Center at Colorado University. Now add in the meteorology. Scientists agree that waters are warming, and that serves as hurricane fuel, said NOAA climate scientist Jim Kossin. A study by Kossin found that, once a storm formed, the chances of its attaining major storm status globally increased by 8% a decade since 1979. In the Atlantic, chances went up by 49% a decade. But scientists disagree on why waters are warming. They know climate change is a factor — but they say it’s not the biggest driver and disagree on what else may be behind it. 
Some argue it’s because of a 25- to 30-year natural global cycle that acts like a giant conveyor belt, carrying different levels of salt and temperature around the globe, including into the part of the tropical Atlantic off Africa where the worst hurricanes form, Colorado State University hurricane researcher Phil Klotzbach said. When the water in the northern Atlantic is extra warm, the water in those tropical hurricane breeding grounds is unusually hot, and the hurricane season is abnormally active, Klotzbach said. Such a busy period started in 1995 and might end soon as northern Atlantic waters shift to a cooler regime, he said. Klotzbach acknowledged that one problem with this theory is that the waters in the northern Atlantic have been unusually cool this summer, and still there have been lots of storms. It may have been a blip, he said. But MIT meteorology professor Kerry Emanuel says it’s because another counterintuitive factor is at play: There are more storms because of cleaner air. European air pollution cooled the area over Africa in the 1960s and 1970s and put more dust into the air — both of which tamped down on any hurricanes, he said. When the pollution eased, Africa got warmer, more storms developed, and that’s why it’s such a busy period, Emanuel said. While climate change is not the most important factor in warming waters, it contributes to creating more damaging storms in other ways, by causing a rising sea level that worsens storm surges and making storms move more slowly and produce more rain, scientists say. All of this means that we should get used to more catastrophic storms, according to Munich Re’s Bove. In addition, he said: “Climate change will be a bigger driver of losses in the future.”")
tensor([[ 1.3516e-02, -6.6536e-02, -1.2586e-01,  7.8865e-02,  1.5436e-01,
          ...
          4.4400e-02,  1.0047e-01, -8.2776e-02]])
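
One possible use of the feature vectors returned by getfeatures is comparing two articles directly. The following is a minimal sketch, not part of the package: cosine_similarity is a hypothetical helper, the short vectors are made-up stand-ins for the 768 values, and it assumes the returned object is a PyTorch tensor of shape (1, 768), as the printed output suggests.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


# Made-up 4-value vectors standing in for the 768 feature values.
# With a real result, features.squeeze(0).tolist() should give a plain list,
# assuming features is a PyTorch tensor of shape (1, 768).
storm_article = [0.12, 0.44, 0.26, -0.36]
similar_article = [0.10, 0.40, 0.30, -0.30]
unrelated_article = [-0.40, 0.05, -0.20, 0.50]

close = cosine_similarity(storm_article, similar_article)
far = cosine_similarity(storm_article, unrelated_article)
```

Under these assumptions, articles on related topics would be expected to score closer to 1 than unrelated ones, although how well this holds for the real embeddings is not something the documentation states.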

Ensemble Model (TF-IDF + TF-IDF Bigrams)

The Ensemble model combines the feature values output by two other models, the TF-IDF model and the TF-IDF Bigrams model, in order to obtain a more accurate classification than either model achieves by itself.

Once a new article is fed into the Ensemble model, the article gets run through both models. Their outputs are concatenated and then run through a two-layer neural network (layer sizes 1600 and 800 respectively) to produce the predicted tags for the article.
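
The concatenate-then-classify step can be sketched as follows. This is an illustration under stated assumptions rather than the package's actual code: the layer sizes are shrunk to 16 and 8 (instead of 1600 and 800), the weights are random placeholders, and the ReLU and sigmoid activations, the separate output layer, the 0.5 threshold and all function and label names are assumptions not given in this documentation.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer: activation(W @ x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_layer(rng, n_in, n_out):
    """Random placeholder weights; a real model would load trained ones."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def ensemble_scores(feature_vectors, layers):
    """Concatenate per-model features, then run them through the network."""
    x = [value for vec in feature_vectors for value in vec]
    (w1, b1), (w2, b2), (w_out, b_out) = layers
    h1 = dense(x, w1, b1, relu)    # first hidden layer (1600 units in the docs)
    h2 = dense(h1, w2, b2, relu)   # second hidden layer (800 units in the docs)
    return dense(h2, w_out, b_out, sigmoid)  # assumed output layer, one score per label


rng = random.Random(0)
tfidf_features = [rng.random() for _ in range(6)]    # stand-in for TF-IDF model output
bigram_features = [rng.random() for _ in range(4)]   # stand-in for TF-IDF Bigrams output
layers = [random_layer(rng, 10, 16), random_layer(rng, 16, 8), random_layer(rng, 8, 3)]

labels = ["label_a", "label_b", "label_c"]  # hypothetical tag names
scores = ensemble_scores([tfidf_features, bigram_features], layers)
predicted = [label for label, s in zip(labels, scores) if s >= 0.5]
```

Because ensemble_scores accepts any number of feature vectors, the same sketch would cover a combination of three or four models by passing a longer list and widening the first layer accordingly.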

N MB of model files will be downloaded (not including the two other models), M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.ensemble.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

>>> from mitnewsclassify import ensemble

Trisemble Model (TF-IDF + TF-IDF Bigrams + Doc2Vec)

The Trisemble model combines the feature values output by three other models, the TF-IDF model, the TF-IDF Bigrams model and the Doc2Vec model, in order to obtain a more accurate classification than any of the models achieves by itself.

Once a new article is fed into the Trisemble model, the article gets run through all three models, and their outputs are concatenated and then run through two layers of neural network (size 1600 and 800 respectively) to produce the predicted tags for the article.

N MB of model files will be downloaded (not including the three other models), M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.trisemble.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

Quadsemble Model (TF-IDF + TF-IDF Bigrams + Doc2Vec + DistilBERT)

The Quadsemble model combines the feature values output by four other models, the TF-IDF model, the TF-IDF Bigrams model, the Doc2Vec model and the DistilBERT model, in order to obtain a more accurate classification than any of the models achieves by itself.

Once a new article is fed into the Quadsemble model, the article gets run through all four models, and their outputs are concatenated and then run through two layers of neural network (size 1600 and 800 respectively) to produce the predicted tags for the article.

N MB of model files will be downloaded (not including the four other models), M MB of RAM will be used on average to run the model and the classification takes P ms.

gettags

class mitnewsclassify.quadsemble.gettags(txt)

Gets the predicted tags for a given article text.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

Example

Download Class

The Download class is a supplement to all the models. Each model calls it automatically if its model files have not yet been downloaded onto the user's storage. Should any problem occur during these automatic calls, the user can call the download class explicitly to achieve the same result.

download

class mitnewsclassify.download.download(model=None)

Downloads the model files for a given model.

Parameters

  • model (string) - The model whose files should be downloaded onto the user's storage. The default value is None, in which case all models are downloaded. Use tfidf for the TF-IDF model, tfidf_bi for the TF-IDF Bigrams model, doc2vec for the Doc2Vec model, gpt2 for the GPT2 model, distilbert for the DistilBERT model, ensemble for the Ensemble model, trisemble for the Trisemble model and quadsemble for the Quadsemble model.

Example

>>> from mitnewsclassify import download

>>> download.download('tfidf')
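
The download-only-if-missing behaviour described above can be sketched as follows. This is a hypothetical illustration, not the package's implementation: ensure_model_files, fake_fetch, the directory layout and the weights.bin file name are all assumptions, and a temporary directory stands in for the user's storage.

```python
import os
import tempfile


def ensure_model_files(model, data_dir, fetch):
    """Call fetch(model, path) only when the model's directory is missing.

    Returns True if a download was triggered, False if files were already there.
    """
    path = os.path.join(data_dir, model)
    if os.path.isdir(path):
        return False  # already present, nothing to download
    os.makedirs(path)
    fetch(model, path)
    return True


def fake_fetch(model, path):
    """Stand-in for a real download: writes a small placeholder file."""
    with open(os.path.join(path, "weights.bin"), "wb") as handle:
        handle.write(b"placeholder")


with tempfile.TemporaryDirectory() as data_dir:
    first = ensure_model_files("tfidf", data_dir, fake_fetch)   # triggers the fetch
    second = ensure_model_files("tfidf", data_dir, fake_fetch)  # skips it
```

Passing the fetch function as a parameter keeps the check separate from the transfer itself, which is one way an explicit call and an automatic call could share the same logic.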

Aliases

For ease of use, the user can also call the models by their aliases. This part was suggested by Prof. Max Tegmark.

lite

class mitnewsclassify.lite(txt)

Calls mitnewsclassify.ensemble.gettags.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

medium

class mitnewsclassify.medium(txt)

Calls mitnewsclassify.trisemble.gettags.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags

badass

class mitnewsclassify.badass(txt)

Calls mitnewsclassify.quadsemble.gettags.

Parameters

  • txt (string) - The article text

Returns

  • tags (List[str]) - The predicted tags
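
The alias-to-model mapping listed above can be summarised in a small dispatch table. The sketch below is illustrative only: ALIAS_TO_MODULE and gettags_by_alias are hypothetical names, not part of the package, and the module is imported lazily so that nothing is loaded until an alias is actually used.

```python
import importlib

# Alias -> module whose gettags function it calls, as listed in this section.
ALIAS_TO_MODULE = {
    "lite": "mitnewsclassify.ensemble",
    "medium": "mitnewsclassify.trisemble",
    "badass": "mitnewsclassify.quadsemble",
}


def gettags_by_alias(alias, txt):
    """Hypothetical helper: resolve an alias and call the matching gettags."""
    if alias not in ALIAS_TO_MODULE:
        raise ValueError(
            "unknown alias %r; expected one of %s" % (alias, sorted(ALIAS_TO_MODULE))
        )
    # Imported only on demand; requires the mitnewsclassify package to be installed.
    module = importlib.import_module(ALIAS_TO_MODULE[alias])
    return module.gettags(txt)
```

With the package installed, gettags_by_alias("lite", article_text) would be expected to behave like mitnewsclassify.lite(article_text).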

Files for mit-news-classify, version 0.8.2

  • mit_news_classify-0.8.2-py3-none-any.whl (28.1 kB) - Wheel, Python version py3
  • mit-news-classify-0.8.2.tar.gz (35.8 kB) - Source
