
Testing and Text Extraction

November 18, 2020

It’s my 19th week at Atorus and a lot has changed over the last 8 weeks. Since my last blog post, I have spent time testing, fixing bugs and developing new features.

Automated Testing

I have continued to run the automated tests in Katalon Studio against the latest versions of GluIQ. The latest version now includes a file-locking system, the option to mark projects as confidential, and the ability to skip straight to the Module 3 (Define It) and Module 4 (Plan It) file creation screens. Each new feature needs new tests. I found writing the tests themselves relatively straightforward, having developed my skills with Katalon Studio in the previous weeks of automated testing. Any new bugs identified by a test run then needed to be fixed.

It’s easy to see why automated testing is so useful: although one test run took over an hour to complete, that is far quicker than running the tests manually, which took several days at the start of my placement.

Text Extraction

More recently, I have been working on extracting text from documents for further Natural Language Processing (NLP). The extraction needs to be done using Python in a Django view so that the results can be saved to the database for further processing. I found that text extraction appeared relatively simple to implement for Word documents and PowerPoints, requiring only a few lines of code. The same cannot be said for images and PDF documents, which will require Optical Character Recognition (OCR) to extract the text.
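To give a feel for why Word documents and PowerPoints are the easy case, here is a minimal sketch of the idea. It is not the code used in GluIQ: it assumes the files are in the modern .docx and .pptx formats, which are zip archives of XML, and it uses only the Python standard library rather than a dedicated package. The function names extract_docx_text and extract_pptx_text are made up for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used for text in Office Open XML files.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"


def extract_docx_text(file_obj):
    """Return the non-empty paragraphs of a .docx file, one per line.

    file_obj is a path or a seekable binary file-like object.
    """
    with zipfile.ZipFile(file_obj) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)


def extract_pptx_text(file_obj):
    """Return the text of every slide in a .pptx file, one paragraph per line.

    Slides are read in file-number order (slide1.xml, slide2.xml, ...),
    which usually, but not always, matches the order shown in PowerPoint.
    """
    lines = []
    with zipfile.ZipFile(file_obj) as archive:
        slides = []
        for name in archive.namelist():
            match = re.fullmatch(r"ppt/slides/slide(\d+)\.xml", name)
            if match:
                slides.append((int(match.group(1)), name))
        for _, name in sorted(slides):
            root = ET.fromstring(archive.read(name))
            for para in root.iter(A_NS + "p"):
                text = "".join(node.text or "" for node in para.iter(A_NS + "t"))
                if text:
                    lines.append(text)
    return "\n".join(lines)
```

Because both functions accept a file-like object, it should be possible to pass in a file uploaded to a Django view and save the returned string to a model field, although I have not shown that part here. Images and scanned PDFs contain no such text to read out, which is why they need OCR instead.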

Education

Alongside developing GluIQ, Erland and I also began an online introductory course in predictive analytics through the University of Edinburgh to enhance our understanding of Python and to gain an insight into data analytics and AI. I found this useful for refreshing my knowledge of Python and expanding my understanding of Python libraries including NumPy, pandas and scikit-learn, some of which I might be able to apply within GluIQ soon.
