How to integrate a wordcloud into your Django web application

Anozie Baron Chibuikem
Jan 23, 2022

In this article, I will show you how to add a wordcloud to your Django application and display it dynamically in the browser.

First, what is a wordcloud? Simply put, it’s a visual representation of text that gives greater prominence to words that appear more frequently.
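To make "more frequent means more prominent" concrete, here is a minimal sketch that counts word frequencies using only the standard library. The sample sentence and the `word_frequencies` helper are made up for illustration; the wordcloud package does its own tokenizing and stop-word filtering, which is more involved than this.

```python
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())


sample = "django makes web apps fun and django makes them fast"
frequencies = word_frequencies(sample)

# The most frequent words are the ones a wordcloud would draw largest.
print(frequencies.most_common(2))
```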
With that said, let’s start building.

Let’s create our project

- Create a virtual environment and activate it
> python -m venv myenv
> source myenv/bin/activate
- Create the Django project and the app
> pip install django
> django-admin startproject wordcloudbackend
> cd wordcloudbackend
> django-admin startapp textvisualization
At this point you should have a structure like this:
├── manage.py
├── textvisualization
├── wordcloudbackend
└── db.sqlite3

Create a requirements.txt file in the root directory with the following content, then install it with pip install -r requirements.txt

django==4.0.1
wordcloud==1.8.1
matplotlib==3.5.1
pandas==1.3.5
openpyxl==3.0.9
mypy==0.910
isort==4.3.21
django-stubs==1.8.0
django-stubs-ext==0.2.0

Let’s register our new application. Open the settings.py file inside the wordcloudbackend folder and add the app to your INSTALLED_APPS, keeping the entries that are already there

INSTALLED_APPS = [
    # ... the default Django apps stay as they are ...
    'textvisualization',
]

Next, go to the urls.py file inside the wordcloudbackend and add the following

from django.urls import include, path

urlpatterns = [
    path('', include('textvisualization.urls')),
]

Inside the textvisualization folder, create a file called utils.py and add the following

import base64
import io
import os
import urllib.parse
from typing import Any, List, Optional, Union

import matplotlib

# Use a non-interactive backend; a web server has no display to draw on.
matplotlib.use("Agg")

import matplotlib.pyplot as plt
import pandas as pd
from django.core.files.uploadedfile import UploadedFile
from wordcloud import STOPWORDS, WordCloud

stopwords = set(STOPWORDS)


def show_wordcloud(data: Optional[Union[List[str], str]]) -> Optional[str]:
    """Render the text as a wordcloud and return it as a base64 data URI."""
    try:
        wordcloud = WordCloud(
            background_color="white", max_words=200, max_font_size=40,
            scale=3, random_state=0, stopwords=stopwords)
        wordcloud.generate(str(data))
        plt.figure()
        plt.imshow(wordcloud, interpolation="bilinear")
        plt.axis("off")
        image = io.BytesIO()
        plt.savefig(image, format="png")
        plt.close()  # free the figure so repeated requests don't pile up
        image.seek(0)
        string = base64.b64encode(image.read())
        image_64 = "data:image/png;base64," + urllib.parse.quote_plus(string)
        return image_64
    except ValueError:
        # WordCloud raises ValueError when there are no words to draw.
        return None


def check_file_type(file: Union[UploadedFile, Any]) -> str:
    """Check the extension of the file."""
    extension = os.path.splitext(file.name)[1]
    return extension


def read_file_by_file_extension(file: Union[UploadedFile, Any]) -> Optional[pd.DataFrame]:
    """Read the content of the file if it's .xlsx or .csv."""
    file_type = check_file_type(file)
    read_file: Optional[pd.DataFrame] = None
    if file_type == ".xlsx":
        read_file = pd.read_excel(file)
    elif file_type == ".csv":
        read_file = pd.read_csv(file)
    else:
        return None
    return read_file

The show_wordcloud() function turns our text into a wordcloud, renders it as a PNG image, and returns that image as a base64-encoded data URI string.
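The encoding step at the end of that function can be tried on its own. The sketch below is an illustration, not part of the project: `to_data_uri` and `from_data_uri` are hypothetical helper names, and the bytes are a stand-in rather than a real image. It only shows that the base64 plus quote_plus combination produces a string that decodes back to the original bytes.

```python
import base64
import urllib.parse


def to_data_uri(png_bytes: bytes) -> str:
    """Wrap raw PNG bytes in a data URI usable as an <img> src."""
    encoded = base64.b64encode(png_bytes)
    # quote_plus accepts bytes and percent-encodes "+", "/" and "=".
    return "data:image/png;base64," + urllib.parse.quote_plus(encoded)


def from_data_uri(uri: str) -> bytes:
    """Reverse to_data_uri, to check that nothing is lost on the way."""
    payload = uri.split(",", 1)[1]
    return base64.b64decode(urllib.parse.unquote_plus(payload))


# Stand-in bytes; a real call would pass the PNG written by plt.savefig.
fake_png = b"\x89PNG\r\n\x1a\n not a real image, just sample bytes"
uri = to_data_uri(fake_png)
print(uri[:40])
print(from_data_uri(uri) == fake_png)
```

Because the whole image travels inside the HTML response, this approach suits small images like our wordcloud and needs no file storage on the server.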

The check_file_type() function returns the extension of a file. Since we read our data from an uploaded file instead of the database, we need this function to determine the type of file a user uploads.
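One detail worth knowing: os.path.splitext keeps the extension's original case and only returns the last suffix, so a file named DATA.CSV yields ".CSV", which does not equal ".csv". The sketch below shows one possible way to handle that; `normalized_extension` is a hypothetical helper, not something the project above defines.

```python
import os


def normalized_extension(filename: str) -> str:
    """Return the file extension in lowercase, including the leading dot."""
    return os.path.splitext(filename)[1].lower()


for name in ["report.xlsx", "REPORT.CSV", "archive.tar.gz", "README"]:
    print(name, "->", repr(normalized_extension(name)))
```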

The read_file_by_file_extension() function reads the content of the uploaded file (either CSV or Excel) after checking the file type, and returns None for any other type.

Inside the textvisualization folder, add the following in the views.py file

from typing import Any

from django.http.request import HttpRequest
from django.http.response import HttpResponse
from django.shortcuts import render
from django.views import View

from textvisualization.forms import DataForm
from textvisualization.utils import read_file_by_file_extension, show_wordcloud


class WordCloudView(View):
    template_name = "textvisualization/wordcloudvisualization.html"

    def get_context_data(self) -> dict[str, Any]:
        """For storing our context."""
        context: dict[str, Any] = {}
        context["DataForm"] = DataForm()
        return context

    def narration_chart_data(self, request: HttpRequest, data: str) -> HttpResponse:
        """For displaying wordcloud."""
        context = self.get_context_data()
        wordcloud = show_wordcloud(data)
        context["wordcloud"] = wordcloud
        return render(request, self.template_name, context)

    def get(self, request: HttpRequest) -> HttpResponse:
        return render(request, self.template_name, self.get_context_data())

    def post(self, request: HttpRequest) -> HttpResponse:
        context = self.get_context_data()
        form = DataForm(request.POST, request.FILES)
        values: list[str] = []
        if form.is_valid():
            user_file = form.cleaned_data["file"]
            read_file = read_file_by_file_extension(user_file)
            if read_file is not None:
                # Collect the text of the "narration" column, row by row.
                for _, row in read_file.iterrows():
                    values.append(row["narration"])
                converted_to_string = " ".join(values)
                return self.narration_chart_data(request, converted_to_string)
        else:
            form = DataForm()
        # Invalid or unreadable upload: show the empty form again.
        return render(request, self.template_name, context)
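Joining the column values with " ".join() only works when every value is a string. If the narration column might contain empty or numeric cells, the values need cleaning first. Here is a minimal sketch of one way to do that, using a hypothetical `join_narrations` helper and made-up cell values:

```python
import math
from typing import Any, Iterable


def join_narrations(values: Iterable[Any]) -> str:
    """Join cell values into one string, skipping empty or missing cells."""
    words = []
    for value in values:
        if value is None:
            continue
        # pandas reads an empty cell as the float NaN by default.
        if isinstance(value, float) and math.isnan(value):
            continue
        text = str(value).strip()
        if text:
            words.append(text)
    return " ".join(words)


cells = ["Lorem ipsum dolor sit amet", float("nan"), None, 42, "  "]
print(join_narrations(cells))
```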

With that done, create urls.py and forms.py files inside the textvisualization folder.
Add the following inside urls.py

from django.urls import path

from textvisualization import views

urlpatterns = [
    path("", views.WordCloudView.as_view(), name="wordcloud"),
]

Add the following inside forms.py

import os

from django.core.exceptions import ValidationError
from django.core.files.uploadedfile import UploadedFile
from django.forms import FileField, Form


class DataForm(Form):
    file = FileField()

    def clean_file(self) -> UploadedFile:
        """Accept only .xlsx and .csv uploads."""
        data = self.cleaned_data["file"]
        extension = os.path.splitext(data.name)[1]
        valid_extensions: list[str] = [".xlsx", ".csv"]
        if extension not in valid_extensions:
            raise ValidationError("File type not supported")
        return data

Create a folder named templates inside the textvisualization folder. Inside it, create another folder named textvisualization, and inside that folder create a file called wordcloudvisualization.html with the following content

<div>
  <h1>Hi welcome to data visualization</h1>
  <form method="post" enctype="multipart/form-data">
    {{DataForm.file}}
    {% csrf_token %}
    <button>Upload Document</button>
  </form>
  {% if wordcloud is not None %}
    <img src="{{ wordcloud }}">
  {% endif %}
</div>

Here is a sample of the Excel file we are using for testing our application

id  narration                               amount
1   Lorem ipsum dolor sit amet              10000
2   vero eos et accusamus et iusto od       30000
3   iusto odio dignissimos ducimus qui      200000
4   Pelumi wordcloud is not None Osande     90200
5   Jessica dignissimos ducimus olande      5000
6   iusto odio welcome to data ducimus qui  200000
7   Pelumi Osande dignissimos ducimus       90200
8   ipsum dolor sit ducimus eos olande      5000
9   beatae vitae dicta sunt explicabo.      70000
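Since the upload form also accepts .csv files, a quick way to get a test file is to generate one. The sketch below builds CSV text from the first three sample rows with the standard library; the `build_sample_csv` helper and the sample.csv file name are assumptions for illustration.

```python
import csv
import io

SAMPLE_ROWS = [
    (1, "Lorem ipsum dolor sit amet", 10000),
    (2, "vero eos et accusamus et iusto od", 30000),
    (3, "iusto odio dignissimos ducimus qui", 200000),
]


def build_sample_csv(rows) -> str:
    """Return CSV text with the id, narration and amount columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "narration", "amount"])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_sample_csv(SAMPLE_ROWS)
print(csv_text)

# To try the upload form, write the text to a file such as sample.csv:
# with open("sample.csv", "w", newline="") as handle:
#     handle.write(csv_text)
```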

Now you can start your server using

python manage.py runserver

Go to http://127.0.0.1:8000 and upload an Excel sheet containing the sample above. You should see the generated wordcloud rendered below the upload form.

And with that, we have been able to add a wordcloud to our Django project.

It’s important to mention that although we read our data from an Excel file and only extracted the “narration” column, you could instead store the data in your database and fetch it from there.
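As a rough illustration of the database route, the sketch below uses the standard-library sqlite3 module rather than the Django ORM, so it can run on its own. The `transactions` table and the `narration_text` helper are hypothetical; in a real Django project you would more likely define a model and query it through the ORM. The resulting string could then be passed to the wordcloud function in the same way as the text read from the file.

```python
import sqlite3

# An in-memory database stands in for the project's real database.
# The "transactions" table is a made-up example, not part of the project.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE transactions "
    "(id INTEGER PRIMARY KEY, narration TEXT, amount INTEGER)"
)
connection.executemany(
    "INSERT INTO transactions (narration, amount) VALUES (?, ?)",
    [
        ("Lorem ipsum dolor sit amet", 10000),
        ("iusto odio dignissimos ducimus qui", 200000),
        (None, 5000),  # a row without a narration
    ],
)


def narration_text(conn: sqlite3.Connection) -> str:
    """Join every non-empty narration into one string for the wordcloud."""
    rows = conn.execute(
        "SELECT narration FROM transactions "
        "WHERE narration IS NOT NULL ORDER BY id"
    )
    return " ".join(row[0] for row in rows)


text = narration_text(connection)
print(text)
connection.close()
```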

Thanks for reading this article. I will see you with another cool topic next time. Till then,

Stay Safe..! Keep Learning..!
