Text Summarization App with Flask, Sumy, and W3.CSS

Alamin Musa Magaga
8 min read · Sep 15, 2021

An advanced text/document summarization web app built with the Sumy package.

Photo by Claudio Testa on Unsplash

Today there is a huge volume of information, documents, and blogs, but much of it is repetitive, obscure, or discursive, so readers need more time to read, analyze, and draw meaningful inferences from it.

Advances in AI and natural language processing tools provide state-of-the-art techniques for easier text preprocessing and summarization.

Text summarization is used in many fields and devices, from text editors to mobile phones, to shorten long passages into a more concise format.

Text summarization is the process of making texts or documents shorter and more concise so that they are easier to understand and use.

In this tutorial, I will show a step-by-step approach to creating a functional text summarization app.

Project Content

  • The Sumy package used in the app
  • Building the app
  • Testing and running the app
  • Conclusion

Sumy package

Sumy is a Python package for text summarization. It offers several algorithms, including LexRank, the heuristic Luhn method, and Latent Semantic Analysis (LSA), all of which we are going to incorporate into our text summarization application.

To get started with this package, run the command below in the command prompt. If you have already installed it, you can skip to the next stage.

pip install sumy

Building the app

The app development consists of two stages:
1-Frontend
2-Backend
We are going to build the front end with HTML and W3.CSS (a modern CSS framework) and the back end with the Flask framework.

Front end
Here, we are going to create two HTML files: index.html and result.html. index.html is the homepage that contains the inputs of the application, and result.html outputs the summarized text. Both templates go in the templates folder, where Flask looks for them by default, while the static folder contains the W3.CSS file for styling the app.
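As a quick sanity check of the project layout described above, here is a minimal sketch that lists which of the expected files are still missing. The file names come from this tutorial and the templates/ and static/ folder names are Flask's defaults; the helper name missing_files is hypothetical, just for illustration.

```python
from pathlib import Path
import tempfile

# File names follow the tutorial; templates/ and static/ are Flask's
# default folder names.
EXPECTED_FILES = [
    "app.py",
    "templates/index.html",
    "templates/result.html",
    "static/w3.css",
]


def missing_files(project_root):
    """Return the expected project files that do not exist under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstrate on a throwaway directory that only contains app.py.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "app.py").write_text("# placeholder\n")
    print(missing_files(tmp))
```

In the demo directory only app.py exists, so the two templates and the stylesheet are reported as missing.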

index.html

<!DOCTYPE html>
<html>
<head>
<title>Summaryzer</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
<!-- linking to the external styling file-->
<link rel="stylesheet" href="{{ url_for('static', filename='w3.css') }}">
</head>
<body>
<!-- Navigation Bar-->
<div class="w3-bar w3-teal w3-padding">
<a href="{{url_for('index')}}" class="w3-bar-item w3-button w3-padding ">Home</a>

<a href="#C1" class="w3-bar-item w3-button w3-right w3-padding" >About</a>
<a href="#C1" class="w3-bar-item w3-button w3-right w3-padding" >Contact</a>
</div>
<br>
<!-- Start of Main Section -->
<div class="w3-container w3-center"><h3>Enter your text Here</h3>
<!-- text input-->
<form method="POST" action="/process">
<textarea name="input_text" cols="5" rows="10" required="true" placeholder="Enter Text Here" style="width:60%"></textarea>

<br/>
<!-- select the model choice-->
<p><h4>Select your Algorithm</h4></p>
<select class="w3-select w3-border w3-round-large" name="model_choice" style="width:40%">


<option value="default" selected>Default</option>
<option value="lex_summarizer">Lex Summarizer</option>
<option value="luhn_summarizer">Luhn Summarizer</option>
<option value="isa_summarizer">LSA Summarizer</option>

</select>
<br>
<br>
<button class="w3-btn w3-blue-grey" type="reset" value="reset">Clear</button>
<button class="w3-btn w3-teal" type="submit" value="reset">Summarize</button>

</form>
</div>
<br><br>
<!-- url link of the text document-->
<div class="w3-container w3-center"><h3>Enter your URL link Here</h3>
<form method="POST" action="/process_url">
<input type="text" name="input_url" placeholder="Enter URL Here" required="true" style="width:50%">
<br><br>
<button class="w3-btn w3-blue-grey" type="reset" value="reset">Clear</button>
<button class="w3-btn w3-teal" type="submit" value="reset">Summarize</button>
</form>
</div>
<br><br>
<!-- footer -->
<div class="w3-container w3-teal">
<div class="w3-row"><div class="w3-half w3-container">
<h4 >About the App</h4>
<p>The app was built with the Flask framework, the Sumy package (LexRank, LSA, and Luhn), and W3.CSS (a modern CSS framework).</p>
Made by <a href="https://twitter.com/Alamin_Magaga">Alamin M Magaga</a><br/>
Founder <a href="https://twitter.com/Magtech_Dihub">@Magtech Digital Hub</a>
</div>

<div class="w3-half w3-container w3-center">
<h4 id="C1">Connect With Me</h4>
<ul>
<a href="https://web.facebook.com/alaminmusa.magaga" target="_blank" class="white-text">
<i class="fa fa-facebook fa-4x"></i>

</a>
<a href="https://www.youtube.com/channel/UCy3KM0QxpZUTgKjXRVID6-A" target="_blank" >
<i class="fa fa-youtube-square fa-4x"></i>

</a>

<a href="https://alaminmusamagaga.medium.com" target="_blank" class="white-text">
<i class="fa fa-medium fa-4x"></i>

</a>

<a href="https://www.linkedin.com/in/alamin-musa-magaga-8b388118b" target="_blank" class="white-text">
<i class="fa fa-linkedin fa-4x"></i>
</a>
<a href="https://github.com/alaminmagaga" target="_blank" class="white-text">
<i class="fa fa-github-square fa-4x"></i>
</a>

<a href="https://twitter.com/Alamin_Magaga" target="_blank" class="white-text">
<i class="fa fa-twitter fa-4x"></i>

</a>

</ul>
</div>
</div>
<br>
</div>
</body>
</html>

We style index.html with an external stylesheet by linking to the W3.CSS file in the static folder:

<link rel="stylesheet" href="{{ url_for('static', filename='w3.css') }}">

The form's action attribute sends the text input to the process function in app.py, which handles the /process route.

<form method="POST" action="/process">

The text area holds the user's input. Its name attribute, input_text, is the key the backend uses to read the text.

<textarea name="input_text" cols="5" rows="10" required="true" placeholder="Enter Text Here" style="width:60%"></textarea>

For the two button types, reset and submit, the former resets the form back to its initial state and the latter submits the input.

<button class="w3-btn w3-blue-grey" type="reset" value="reset">Clear</button>
<button class="w3-btn w3-teal" type="submit" value="reset">Summarize</button>
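To make it clearer how the form fields described above reach the backend, here is a small sketch using only Python's standard library. It mimics what the browser does when it submits the form (URL-encoding the fields into the POST body) and what the server does to decode them. The sample text is made up; the field names match the name attributes in index.html.

```python
from urllib.parse import parse_qs, urlencode

# Field names match the name attributes in index.html; the values are
# made-up sample input.
form_fields = {
    "input_text": "Flask makes small web apps easy.",
    "model_choice": "luhn_summarizer",
}

# A browser submitting the form sends the fields URL-encoded in the POST body.
body = urlencode(form_fields)
print(body)

# The server decodes the body back into name/value pairs; Flask exposes
# the decoded values through request.form.
decoded = {name: values[0] for name, values in parse_qs(body).items()}
print(decoded["model_choice"])
```

This is why the keys used with request.form in app.py have to match the name attributes in the HTML exactly.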

result.html

The result.html file contains the template variables that display the summarized text.

<!DOCTYPE html>
<html >
<head>
<meta charset="UTF-8">
<title>prediction</title>
<!-- linking the result.html file to the styling folder-->
<link rel="stylesheet" href="{{ url_for('static', filename='w3.css') }}">
<style type="text/css"></style>
</head>
<body><div class='w3-padding w3-teal w3-center'><h1>Text Summarization Result</h1></div>

<div class="w3-container w3-padding-24"><div class="w3-container w3-card">
<div class="w3-row">
<div class="w3-half w3-container ">
<h5>Original Text</h5>
<p>Reading Time: {{ final_reading_time }} minute </p>
<p >{{ctext}}</p>

</div>
<div class="w3-half w3-container w3-light-grey">
<h5>Summarized Text</h5>

<p>Reading Time: <span class="w3-text-teal">{{ summary_reading_time }} minute </span></p>

<p>{{ final_summary }}</p>


</div>
</div>
</div>
</div><!-- back button-->
<div class="w3-center">
<a href="{{url_for('index')}}" style="text-decoration:none">
<button class="w3-btn w3-teal w3-round-large" style="width:10%;text-decoration:none;" type="submit" value="reset">Back</button></a></div>
<br>
</body>
</html>

The snippet below shows the template variables for the estimated reading time of the original text and for the original text itself:

<p>Reading Time: {{ final_reading_time }} minute </p>
<p >{{ctext}}</p>

The snippet below shows the template variables for the estimated reading time of the summary and for the summarized text itself:

<p>Reading Time: {{ summary_reading_time }} minute </p>
<p>{{ final_summary }}</p>
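Since the reading time is passed to the template as a plain number of minutes, the page may show values such as 0.235 minute. One possible refinement, not part of the original app, is to format the number before rendering. The helper name format_reading_time below is hypothetical.

```python
def format_reading_time(minutes):
    """Turn a reading time in minutes (a float) into a short display string."""
    if minutes < 1:
        return "less than 1 minute"
    rounded = round(minutes)
    unit = "minute" if rounded == 1 else "minutes"
    return f"{rounded} {unit}"


print(format_reading_time(0.235))  # less than 1 minute
print(format_reading_time(3.6))    # 4 minutes
```

The formatted string could then be passed to render_template in place of the raw number.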

Backend

The Flask framework makes it simple to build machine learning and data science apps. Its flexibility, tooling, and documentation make it easy to test, run, and debug code.

Photo by Monocubed

We are going to create an app.py file that uses Flask and the Sumy package to summarize texts, sentences, and documents into shorter, more precise text. The code also relies on spaCy's en_core_web_sm model and on BeautifulSoup (the bs4 package), which must be installed separately.

app.py

# code
from flask import Flask, render_template, url_for, request
import time
import spacy
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer
from sumy.summarizers.luhn import LuhnSummarizer
from sumy.summarizers.lsa import LsaSummarizer
from bs4 import BeautifulSoup
from urllib.request import urlopen, Request

nlp = spacy.load("en_core_web_sm")
app = Flask(__name__)


# Each summarizer returns the three highest-ranked sentences as one string
def lex_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    lex_summarizer = LexRankSummarizer()
    summary = lex_summarizer(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary]
    result = ' '.join(summary_list)
    return result


def luhn_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    summarizer_luhn = LuhnSummarizer()
    summary_1 = summarizer_luhn(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary_1]
    result = ' '.join(summary_list)
    return result


def isa_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    summarizer_lsa = LsaSummarizer()
    summary_2 = summarizer_lsa(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary_2]
    result = ' '.join(summary_list)
    return result


# Reading Time
def readingTime(mytext):
    total_words = len([token.text for token in nlp(mytext)])
    estimatedTime = total_words / 200.0
    return estimatedTime


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/process', methods=['GET', 'POST'])
def process():
    start = time.time()
    if request.method == 'POST':
        input_text = request.form['input_text']
        model_choice = request.form['model_choice']
        final_reading_time = readingTime(input_text)
        if model_choice == 'luhn_summarizer':
            final_summary = luhn_summary(input_text)
        elif model_choice == 'isa_summarizer':
            final_summary = isa_summary(input_text)
        else:
            # 'default' and 'lex_summarizer' both use LexRank
            final_summary = lex_summary(input_text)
        summary_reading_time = readingTime(final_summary)
        end = time.time()
        final_time = end - start
        return render_template('result.html', ctext=input_text,
                               final_reading_time=final_reading_time,
                               summary_reading_time=summary_reading_time,
                               final_summary=final_summary,
                               model_selected=model_choice)
    # A plain GET carries no form data, so show the input page instead
    return render_template('index.html')


def get_text(url):
    reqt = Request(url, headers={'User-Agent': "Magic Browser"})
    page = urlopen(reqt)
    # Name the parser explicitly so BeautifulSoup does not have to guess
    soup = BeautifulSoup(page, 'html.parser')
    fetched_text = ' '.join(map(lambda p: p.text, soup.find_all('p')))
    return fetched_text


@app.route('/process_url', methods=['GET', 'POST'])
def process_url():
    start = time.time()
    if request.method == 'POST':
        input_url = request.form['input_url']
        raw_text = get_text(input_url)
        final_reading_time = readingTime(raw_text)
        final_summary = lex_summary(raw_text)
        summary_reading_time = readingTime(final_summary)
        end = time.time()
        final_time = end - start
        return render_template('result.html', ctext=raw_text,
                               final_summary=final_summary,
                               final_time=final_time,
                               final_reading_time=final_reading_time,
                               summary_reading_time=summary_reading_time)
    # A plain GET carries no form data, so show the input page instead
    return render_template('index.html')


if __name__ == '__main__':
    app.run(debug=True)

Step 1: Import the necessary libraries


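from flask import Flask, render_template, url_for, request
import time
import spacy
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lex_rank import LexRankSummarizer
from sumy.summarizers.luhn import LuhnSummarizer
from sumy.summarizers.lsa import LsaSummarizer
from bs4 import BeautifulSoup
from urllib.request import urlopen, Request

# load the spaCy English model (used to count words) and create the Flask app
nlp = spacy.load("en_core_web_sm")
app = Flask(__name__)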

Step 2: Create functions that summarize the text with three different Sumy algorithms

def lex_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    lex_summarizer = LexRankSummarizer()
    summary = lex_summarizer(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary]
    result = ' '.join(summary_list)
    return result


def luhn_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    summarizer_luhn = LuhnSummarizer()
    summary_1 = summarizer_luhn(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary_1]
    result = ' '.join(summary_list)
    return result


def isa_summary(docx):
    parser = PlaintextParser.from_string(docx, Tokenizer("english"))
    summarizer_lsa = LsaSummarizer()
    summary_2 = summarizer_lsa(parser.document, 3)
    summary_list = [str(sentence) for sentence in summary_2]
    result = ' '.join(summary_list)
    return result

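The code above summarizes text with three Sumy algorithms: LexRank, the heuristic Luhn method, and Latent Semantic Analysis (LSA). Each function parses the input, asks the summarizer for the three highest-ranked sentences, and joins them into a single string.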
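To give a feel for what an extractive summarizer does under the hood, here is a deliberately simplified sketch in plain Python. It is not Sumy's implementation of any of the three algorithms; it only scores sentences by how frequent their words are in the whole text, loosely in the spirit of frequency-based methods such as Luhn's, and keeps the top ones. The function name simple_summary, the stop-word list, and the sample text are all made up for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real summarizers use much larger ones.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}


def simple_summary(text, sentence_count=3):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    def content_words(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return [w for w in words if w not in STOP_WORDS]

    # Count how often each content word appears in the whole text.
    frequencies = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(frequencies[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best ones, then restore order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:sentence_count])
    return " ".join(sentences[i] for i in chosen)


text = (
    "Flask is a web framework. "
    "Sumy is a summarization package. "
    "The weather was nice. "
    "Sumy works well with a Flask web app."
)
print(simple_summary(text, sentence_count=2))
# Flask is a web framework. Sumy works well with a Flask web app.
```

The sentence about the weather shares no words with the rest of the text, so it scores lowest and is dropped first. Sumy's algorithms use more refined scoring, but the overall shape (split, score, pick the top sentences) is similar.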

Step 3: Calculate the total reading time


def readingTime(mytext):
    total_words = len([token.text for token in nlp(mytext)])
    estimatedTime = total_words / 200.0
    return estimatedTime

This function estimates the time, in minutes, required to read the text. It counts the tokens with spaCy and assumes a reading speed of 200 words per minute.
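If you want to try the same estimate without loading a spaCy model, the sketch below swaps the tokenizer for a plain whitespace split. This is an assumption made for illustration, not what the app does: spaCy also counts punctuation marks as tokens, so the two results will differ slightly. The name reading_time_minutes is hypothetical.

```python
WORDS_PER_MINUTE = 200  # same reading speed the app's readingTime assumes


def reading_time_minutes(text, words_per_minute=WORDS_PER_MINUTE):
    """Estimate reading time in minutes from a whitespace word count."""
    total_words = len(text.split())
    return total_words / words_per_minute


sample = "word " * 500
print(reading_time_minutes(sample))  # 2.5
```

A 500-word text at 200 words per minute gives 500 / 200 = 2.5 minutes.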

Step 4: Define a function that returns the summarized text

# render index.html from the templates folder as the homepage
@app.route('/')
def index():
    return render_template('index.html')


# receive the text input and summarize it with the chosen algorithm
@app.route('/process', methods=['GET', 'POST'])
def process():
    start = time.time()
    if request.method == 'POST':
        input_text = request.form['input_text']
        model_choice = request.form['model_choice']
        final_reading_time = readingTime(input_text)
        if model_choice == 'luhn_summarizer':
            final_summary = luhn_summary(input_text)
        elif model_choice == 'isa_summarizer':
            final_summary = isa_summary(input_text)
        else:
            # 'default' and 'lex_summarizer' both use LexRank
            final_summary = lex_summary(input_text)
        summary_reading_time = readingTime(final_summary)
        end = time.time()
        final_time = end - start
        return render_template('result.html', ctext=input_text,
                               final_reading_time=final_reading_time,
                               summary_reading_time=summary_reading_time,
                               final_summary=final_summary,
                               model_selected=model_choice)
    # A plain GET carries no form data, so show the input page instead
    return render_template('index.html')

The process function takes both the text input from the form and the choice of algorithm, then renders result.html with the original text, the summarized text, and the estimated reading time of each.
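The chain of if/elif branches that picks the algorithm can also be written as a dictionary lookup. The sketch below is one possible alternative, not the code the app uses. The three summarizer functions here are stand-ins that only tag the text so the example runs on its own; in the app they would be the real lex_summary, luhn_summary, and isa_summary. The names SUMMARIZERS and summarize are hypothetical.

```python
# Stand-in summarizers so the sketch runs on its own; in the app these would
# be the lex_summary, luhn_summary and isa_summary functions from app.py.
def lex_summary(text):
    return "lex: " + text


def luhn_summary(text):
    return "luhn: " + text


def isa_summary(text):
    return "lsa: " + text


# Keys match the option values in index.html.
SUMMARIZERS = {
    "default": lex_summary,
    "lex_summarizer": lex_summary,
    "luhn_summarizer": luhn_summary,
    "isa_summarizer": isa_summary,
}


def summarize(text, model_choice):
    """Pick a summarizer by name, falling back to LexRank for unknown values."""
    summarizer = SUMMARIZERS.get(model_choice, lex_summary)
    return summarizer(text)


print(summarize("some text", "luhn_summarizer"))  # luhn: some text
print(summarize("some text", "unknown"))          # lex: some text
```

With this layout, adding another algorithm only needs one new dictionary entry and one new option in the select element.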

Step 5: Create a function that fetches the text of a document from a URL

def get_text(url):
    reqt = Request(url, headers={'User-Agent': "Magic Browser"})
    page = urlopen(reqt)
    # name the parser explicitly so BeautifulSoup does not have to guess
    soup = BeautifulSoup(page, 'html.parser')
    fetched_text = ' '.join(map(lambda p: p.text, soup.find_all('p')))
    return fetched_text

Step 6: Create a function that returns the summary of the text fetched from the URL

@app.route('/process_url', methods=['GET', 'POST'])
def process_url():
    start = time.time()
    if request.method == 'POST':
        input_url = request.form['input_url']
        raw_text = get_text(input_url)
        final_reading_time = readingTime(raw_text)
        final_summary = lex_summary(raw_text)
        summary_reading_time = readingTime(final_summary)
        end = time.time()
        final_time = end - start
        return render_template('result.html', ctext=raw_text,
                               final_summary=final_summary,
                               final_time=final_time,
                               final_reading_time=final_reading_time,
                               summary_reading_time=summary_reading_time)
    # A plain GET carries no form data, so show the input page instead
    return render_template('index.html')

The get_text function fetches the page at the given URL and joins the text of all its paragraph (p) elements. The process_url function takes the URL from the form, summarizes the fetched text with LexRank, and renders result.html with the summarized text and the estimated reading times.
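To see the paragraph-extraction idea in isolation, without a network request or BeautifulSoup, here is a rough stand-in built on Python's standard html.parser module. It works on an HTML string instead of a live page, and the names ParagraphExtractor and paragraphs_to_text are hypothetical. The app itself keeps using BeautifulSoup's find_all('p').

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # Start a new, empty paragraph buffer.
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> still arrives here.
        if self.in_paragraph:
            self.paragraphs[-1] += data


def paragraphs_to_text(html):
    """Join the text of all <p> elements with single spaces."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(p.strip() for p in parser.paragraphs if p.strip())


page = (
    "<html><body><h1>Title</h1>"
    "<p>First <b>bold</b> paragraph.</p><p>Second one.</p>"
    "</body></html>"
)
print(paragraphs_to_text(page))  # First bold paragraph. Second one.
```

The heading is skipped because it is not inside a p element, which is the same reason get_text leaves out menus, headers, and other non-paragraph content.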

Testing and Running the App

Navigate to the project directory from your command prompt or Anaconda prompt and run the command below:

python app.py

Copy and paste this URL, http://127.0.0.1:5000/, into your browser to access the app.

photo by magtech digital hub

Conclusion

Congratulations! In this project we built a web app that summarizes text documents with three Sumy algorithms, and we used W3.CSS (a modern CSS framework) to make it more presentable and interactive.

You can download the complete project from the link.


Alamin Musa Magaga

Data Scientist | Developer | Embedded System Engineer | Zindi Ambassador | Omdena Kano Lead | Youth Opportunities Ambassador | CTO YandyTech