The Tamil words on the x-axis are not rendered correctly.
Code used:
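import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
from nltk import FreqDist  # FreqDist is presumably NLTK's; the import was not shown in the post

font = fm.FontProperties(fname='Nirmala.ttf')  # specify font
# allwords is the list of Tamil tokens; its definition was not shown in the post
mostcommon_small = FreqDist(allwords).most_common(25)
with open("output.csv", "w", encoding="utf-8") as f:
    # The body of this block is missing from the post; writing word,count rows is an assumption
    for word, count in mostcommon_small:
        f.write(f"{word},{count}\n")
x, y = zip(*mostcommon_small)
plt.xlabel('Words', fontproperties=font)
plt.ylabel('Frequency of Words', fontsize=50)
plt.title('Frequency of 25 Most Common Words', fontsize=60)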
Unfortunately, I think this is a limitation of our text layout engine, which does not handle ligatures correctly.
If I understand correctly, சொ is made up of ச followed by ஒ (I do not speak or read Tamil and am just looking at Wikipedia, so I apologize if I am making mistakes), which are then rendered as a ligature that puts the vowel to the left even though it comes second in the string.
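To make the ordering concrete, here is a small sketch using only Python's standard `unicodedata` module. It assumes the syllable is stored as the consonant U+0B9A followed by the dependent vowel sign U+0BCA (the combining form of ஒ), which is how it is typically encoded; `describe` is a helper name made up for this example.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs in logical (stored) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The syllable as it is typically stored: consonant first, vowel sign second.
syllable = "\u0b9a\u0bca"
for code_point, name in describe(syllable):
    print(code_point, name)

# Canonical decomposition splits the vowel sign into two parts. A shaping
# engine has to draw the first part to the left of the consonant and the
# second to the right, even though both follow the consonant in the string.
decomposed = unicodedata.normalize("NFD", syllable)
for code_point, name in describe(decomposed):
    print(code_point, name)
```

A layout engine that simply places glyphs in string order cannot produce this reordering, which is consistent with the broken tick labels described above.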
To fix this in Matplotlib we would need to replace our layout engine (switching to raqm is probably the right path).
In the short term, mplcairo (https://github.com/matplotlib/mplcairo, a new cairo backend for Matplotlib) does use raqm, and I think it will render your text correctly.
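For anyone trying this, a rough sketch of switching backends might look like the following. It assumes mplcairo is installed with raqm available on the system, and that `module://mplcairo.base` is the non-interactive backend name (as listed in the mplcairo README). The helpers `choose_backend` and `plot_words`, and the font and output paths passed to them, are made up for this example.

```python
import importlib.util


def choose_backend():
    """Return the mplcairo backend if mplcairo is importable, else plain Agg."""
    if importlib.util.find_spec("mplcairo") is not None:
        return "module://mplcairo.base"
    return "agg"  # fallback: Tamil shaping will most likely still be wrong


def plot_words(words, counts, font_path, out_path):
    """Draw a bar chart of word counts with the words as x tick labels."""
    import matplotlib
    matplotlib.use(choose_backend())  # select the backend before pyplot is imported
    import matplotlib.pyplot as plt
    import matplotlib.font_manager as fm

    font = fm.FontProperties(fname=font_path)  # a font that covers Tamil
    positions = range(len(words))
    fig, ax = plt.subplots()
    ax.bar(positions, counts)
    ax.set_xticks(positions)
    ax.set_xticklabels(words, fontproperties=font)
    ax.set_xlabel("Words")
    ax.set_ylabel("Frequency of Words")
    fig.savefig(out_path)
```

With the variables from the question, a call could look like `plot_words(x, y, "Nirmala.ttf", "words.png")`.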
Thank you so much, @tacaswell. mplcairo + raqm worked beautifully.