I hate the guys who created Python 3, now I want Java again

The last several months have been difficult and I haven’t blogged. I did some development, but there is no point talking about it because I want to start from scratch with Java.

I hate Python 3 and everything that comes with it

I am an average Joe developer and certainly not one of those who lead technological advances. I hate the guys who created Python 3 and decided it should be incompatible with Python 2, because it leads to so many bugs and difficulties in development. They said we should ditch Python 2, but most Linux distributions come with Python 2 preinstalled as “the Python”. Python’s installer Pip likewise comes as Pip 2 and Pip 3, with Pip 2 being the default one.

Now I just want to write programs and do development instead of messing with system settings. Today I wasted at least one hour configuring PyCharm Professional with virtualenv. I don’t know what the problem is and I don’t want to waste time on it.

Could it be that the package “mecab-python3” used some default setting of Python 2 and crashed? I don’t know. I have already configured the system to default to Python 3 and Pip 3, but it still happens.
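One quick way to narrow this kind of problem down is to ask the running interpreter what it actually is. The following is only a minimal sketch using the standard library; the function name describe_interpreter is made up for illustration, and it does not explain the mecab-python3 crash itself.

```python
import sys


def describe_interpreter():
    """Reports which Python is actually running the current script."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # make sys.prefix differ from sys.base_prefix instead.
    in_virtualenv = hasattr(sys, "real_prefix") or (
        sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )
    return {
        "version": "%d.%d.%d" % sys.version_info[:3],
        "executable": sys.executable,
        "in_virtualenv": in_virtualenv,
        "is_python3": sys.version_info[0] >= 3,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "=", value)
```

Running the same script from the IDE and from the command line shows whether both are really using the same interpreter and the same virtualenv.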

I installed PyCharm Professional ( Trial ) because I don’t want to fiddle with the command line anymore

Throughout this “project”, I coded with command-line “Flask” and Notepad++. But I am fed up with it because every time I make a code change, I have to kill the running Flask server with Ctrl+Z and run “Python app.py” again. Ideally, I wish there were an IDE that behaves like Eclipse does for Java. Eclipse supports Tomcat, I can bundle a Spring project with it, and the configuration is so simple. I just press the green “Run” arrow button. Coding and testing happen in the same window.

Now this command line won’t even work

Because I haven’t touched the code for a while, somehow it doesn’t work as before. Did I do a fresh install of my PC? I don’t remember. It should work as expected, but it does not.

There should be a way to do development so that you close the IDE today and it still works 100 days later when you click the funny icon that launches the IDE.

I made a simple program and at least 15 actual people said thank you

It’s been a while since the last post because of life, and my project has been on pause. This post is not about the project, though, but about a simple program I made for my own good.

I am learning German, and one of the greatest difficulties is that German verbs are conjugated ( a verb changes its form under different tenses ), so I think it would be very helpful to have a full list of all six conjugations of some common verbs. I have a list of the 100 most important verbs, which I got from the internet ( there are plenty of such lists ), and an online conjugator service ( http://conjugator.reverso.net/ ). You enter the infinitive form of a verb on the website and it returns all of its forms.

Screenshot of Reverso Conjugation

However, I don’t want to look up every word because that requires 100 or more lookups. I asked about alternatives on the subreddit /r/German, but people suggested Duolingo and Memrise, which I don’t like. I much prefer a full list in Excel format so I can browse it whenever I need. But obviously the website would not provide this kind of list, because no one would need their service anymore once they had the list.

In short, I wrote the program and ran it with the 100 most important German verbs, using Selenium to look up the conjugations for me on Reverso Conjugation. I posted the resulting Excel file on Reddit and received 233 upvotes ( the greatest number of upvotes I have received on Reddit ) and 15 actual “Thank you” replies in the comment section.

Screenshot of my post at Reddit

To be honest, I know I am not the brilliant kind of developer, and I am probably mediocre, but at least I know how to program, and those programs are much more than a simple Hello World. I usually refuse to work on a project unless I am motivated ( either by money, as in the office, or because I need that program ). I am keen on automation, scripting and web scraping. Full stack programming is not my strength and I am not familiar with the latest technology, but I produce VALUE.

The Excel file I made with a Java program. It is beautiful and very useful.

More on the Java program itself. It is a normal Java program with the help of Selenium and JSoup. I have to use Selenium because Reverso Conjugation is not a static website; it generates its content dynamically with JavaScript. So I have to use Selenium to pretend that an actual person is using a real browser to look things up, and then copy the source code of the generated web page. After I get the source code, I break it apart with JSoup and look for what I need. I know that for each tense the content is stored under an HTML tag matched by the selector “.blue-box-wrap”, so I take everything from there.
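The original program was written in Java with JSoup and is not shown here. As a rough illustration of the extraction step only, here is a sketch in Python using just the standard library. It assumes the page source has already been saved by the browser automation; the names BlueBoxExtractor and extract_boxes and the sample markup in the usage note are made up, and only the class name “blue-box-wrap” comes from the description above.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlueBoxExtractor(HTMLParser):
    """Collects the text inside every element whose class list has TARGET."""

    TARGET = "blue-box-wrap"

    def __init__(self):
        super().__init__()
        self.boxes = []    # one list of text fragments per matched element
        self._depth = 0    # nesting depth inside the current matched element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.TARGET in classes:
            self.boxes.append([])
            self._depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth > 0 and text:
            self.boxes[-1].append(text)


def extract_boxes(page_source):
    """Returns one list of text fragments per blue-box-wrap element."""
    parser = BlueBoxExtractor()
    parser.feed(page_source)
    parser.close()
    return parser.boxes
```

Feeding it a made-up fragment such as `<div class="blue-box-wrap"><ul><li>ich <b>gehe</b></li></ul></div>` gives one list of text fragments per box, which could then be written out as one spreadsheet row per tense.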

“Chrome is being controlled by automated test software.”

That’s it for today!

Project progress – something I don’t like to do

 

I have been holding off development of my web application because I have to tinker with security between the frontend and the backend.

When I add a certificate, it gives me difficulties even in the testing environment, and I don’t like messing with OpenSSL certificates and such. In a production environment you would have to buy a real certificate, which costs a yearly fee. I could make a self-signed certificate in my testing environment, but it would trigger a big scary warning even in Chrome.

I am working on it. The first thing I have to do is make the warning in Postman disappear so that I can test my REST API again.

Project progress – Things got into shape, and I finally know how to appreciate REST APIs

Language Translator from IBM Cloud

The project has finally gotten into shape! Last time I wrote that I wanted to include a translation from Google Translate in my web application. It turned out Google charges quite a bit for the Google Translate API, so I looked for another free service and found Language Translator from IBM Cloud. It allows free translation of 1,000,000 characters per month, which is enough for development purposes.

latest screenshot of the web application

Product Layout

Now this web application is divided into four sections ( excluding the button ):

Top-left section: Source text provided by the user

Top-middle section: A version of the source text tokenized by part of speech. Accessed through a REST API written with Flask.

Rightmost section: A list of words translated with the free Japanese-English dictionary JMdict.

Lower section: A large textarea which holds the translation result from IBM Cloud Language Translator.

Product Functions

  • Tokenize Japanese text by part of speech, as estimated by MeCab
  • When the mouse hovers over a tokenized word, a tooltip shows its part of speech
  • Provide dictionary results for the words
  • Provide a real translation from a translation service

Now I am not going to make it prettier just for the sake of it. I have to get the production environment ready and work out how to ship it. I also have to wrap my REST API in HTTPS and make sure no one can access it except my server.

To be honest, the Language Translator API sucks

IBM Cloud provides four ways to access its Language Translator: curl, Node, Java and Python. Ideally I would have chosen the Python way. However, it requires Visual C++ 14.0, and I would have to download gigabytes of data just to install Visual Studio Community 2017, which contains the Visual C++ compiler. To be honest, I hate Microsoft stuff with a passion because of my horrible experience coding in Visual Basic.

Since it is a Python project, I don’t want to mix it with Java or NodeJS. The last option is an ugly curl command like this:

curl --user apikey:{apikey_value} --request POST --header "Content-Type: application/json" --data "{\"text\": [\"Hello, world!\", \"How are you?\"], \"model_id\":\"en-es\"}" https://gateway.watsonplatform.net/language-translator/api/v3/translate?version=2018-05-01

It has to use backslashes ( ‘\’ ) to escape the double quotes ( " ) in order to send JSON as curl data. And in order to use this in a Python program, not only do I have to use Python’s subprocess module to call the external program “curl”, I also have to escape the backslashes one more time, so every inner quote ends up as ( \\\" ).

"curl --user apikey:XXXXXXXXXXXXXXXXXXXXXXXXXXXX --request POST --header \"Content-Type: application/json; charset=UTF-8\" --data \"{\\\"text\\\":[\\\"" + text + "\\\"],\\\"model_id\\\":\\\"ja-en\\\"}\" \"https://gateway-syd.watsonplatform.net/language-translator/api/v3/translate?version=2018-05-01\""

Man, this is ugly, but it comes with a free service. Now I finally understand why REST APIs are a thing. You don’t need to install anything; you just send a request over HTTP and it magically returns something to you!
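Since the service is plain HTTP, the same request can in principle be built without curl or any manual escaping. The following is a sketch, not what the project actually does: it uses only the Python standard library, the endpoint and the “apikey” user name are copied from the curl command above, and the function names build_translate_request and translate are made up for illustration.

```python
import base64
import json
import urllib.request

# Endpoint copied from the Python string above.
TRANSLATE_URL = ("https://gateway-syd.watsonplatform.net/"
                 "language-translator/api/v3/translate?version=2018-05-01")


def build_translate_request(text, apikey, model_id="ja-en", url=TRANSLATE_URL):
    """Builds the same POST request as the curl command, without sending it."""
    # json.dumps does all the quoting, so no manual backslashes are needed.
    body = json.dumps({"text": [text], "model_id": model_id}).encode("utf-8")
    # curl's --user apikey:<value> is HTTP Basic auth with user name "apikey".
    token = base64.b64encode(("apikey:" + apikey).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=UTF-8",
            "Authorization": "Basic " + token,
        },
    )


def translate(text, apikey):
    """Sends the request and returns the decoded JSON reply as a dict."""
    with urllib.request.urlopen(build_translate_request(text, apikey)) as reply:
        return json.load(reply)
```

Because the JSON body is produced by json.dumps, text that itself contains double quotes or backslashes is escaped correctly, which the hand-built curl string does not guarantee.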

Project progress – Added access to dictionary and new problem

Screenshot of the prototype

What was done:

  • I have added JMdict to my website. JMdict is an open-source Japanese-English dictionary.
  • Functionality to tokenize Japanese text: copy the text and paste the result into the “right” box.
  • Added an extra column on the far right which shows some word definitions, with the help of JMdict.

What I want to do:

  • Make the dictionary translation more accurate
  • Add an extra area at the bottom which provides a translation result using the Google Translate API.

Problem:

Since I am not familiar with frontend/HTML/CSS stuff, I don’t know how to fit an extra box in that area, because the area is already occupied although it looks empty. The column on the far right is in the same flex group as the other two boxes; since it needs extra height to display its content, the whole flex box stretched and occupied the empty space below.

Project progress – How to synchronize two development machines

Sample screenshot, the result has a space inserted between words

 

Screenshot of the Backend API when it was called.

I have made some progress on the project, but I was delayed by 1) busy work, 2) the incessant rain in July and August, and 3) not wanting to manually synchronize two copies on my notebook and my home machine, which run different operating systems.

I figured out how to use React and put some UI on the web page. Now there is a button which calls an HTTP REST API, takes the input from the left textarea, and puts the result into the right textarea.

Problem: Do development on two separate machines ( Windows and Linux ) without having to manually copy and compile the program

Thoughts:
– [Done] First, put the frontend code on Dropbox, because it’s just HTML/JS with the React and Semantic UI libraries
– [Problem solved] There might be a problem with the backend because its development used Windows Python/virtualenv, Pip, and Flask

– [Done] I am going to put it on GitHub and sync it with git push and git pull
– I also want to set up a Git hook which triggers a git pull from my other machine. Since it involves embedding a username and password, I will do it later.

Project progress and something I like/hate about ReactJS

Actually, I finished the backend API a while ago. Instead of using Django, I chose Flask because I don’t need Django’s models. There are so many steps needed to do a simple thing in Django. With Flask, I write an API and it’s done. Of course Django is more mature and has more features, but I don’t need them now.

The problem is the frontend, because I am not a frontend developer; to put it precisely, I don’t do much HTML/CSS/JavaScript. Anyway, I don’t want to use AngularJS ( which I have experience with ) because it’s a full-blown framework with its own MVC paradigm. I just want a simple web page which shows two textboxes and a button. So I chose React and had to go through the tutorial.

For some reason, I kinda hate it. I like the component idea, and I don’t have to clutter up my HTML with JSP/PHP or whatever; I just put everything inside a .js file. However, during the tutorial, I found some things which made me nauseous.

Function and ECMA6 class

First of all, there is the choice between using a function or an ECMA6 class to describe a component. In the official documentation, there are many examples where a function AND an ECMA6 class appear together, and both of them are React components. For instance:

function BoilingVerdict(props) {
  if (props.celsius >= 100) {
    return <p>The water would boil.</p>;
  }
  return <p>The water would not boil.</p>;
}
class Calculator extends React.Component {
  constructor(props) {
    super(props);
    this.handleChange = this.handleChange.bind(this);
    this.state = {temperature: ''};
  }

  handleChange(e) {
    this.setState({temperature: e.target.value});
  }

  render() {
    const temperature = this.state.temperature;
    return (
      <fieldset>
        <legend>Enter temperature in Celsius:</legend>
        <input
          value={temperature}
          onChange={this.handleChange} />
        <BoilingVerdict
          celsius={parseFloat(temperature)} />
      </fieldset>
    );
  }
}

The class way of making a component makes sense. The functional way doesn’t make much sense to me. In my paradigm, there should be only ONE way to do the same thing. It is confusing to see a function there that is not a real function. Yes, it does return something, but it serves the same purpose as a class. I think they should change all of them to classes instead of using functions for components.

States and Props

Second, there is the difference between state and props. It’s confusing in the example because they introduce state first with a working example, and later tell me to put it away and replace it with props. In addition, they are used in the same way, “this.state.someThing” and “this.props.someThing”, which I think is confusing for beginners. I assume it’s like the fields of a superclass in Java, but it takes extra work to pass the state of the superclass to the subclass. We could do it more cleanly with Java:

package me.rayentwickler.main;

public class TestSuperClass {
    private String generalPropertyA;

    public TestSuperClass(String generalPropertyA) {
        this.generalPropertyA = generalPropertyA;
    }

    public String getGeneralPropertyA() {
        return generalPropertyA;
    }

    public void setGeneralPropertyA(String generalPropertyA) {
        this.generalPropertyA = generalPropertyA;
    }
}

package me.rayentwickler.main;

public class TestSubClass extends TestSuperClass {

    public TestSubClass(String generalPropertyA) {
        super(generalPropertyA);
    }

    public static void main(String[] args) {
        TestSubClass myInstance = new TestSubClass("ABC");
        System.out.println(myInstance.getGeneralPropertyA());
    }
}

In Java, if two classes have an ancestor/child relationship, I don’t have to tell Java what kind of fields the ancestor has. But in ReactJS, it seems I have to write a lot of boilerplate code!

Props are Read-Only, but we could change it by setState()

React says it provides a one-way data flow to make the world safer, so props are read-only. However, because the parent component still needs to access all the states of its children, we need to “lift the state up” so that every state is stored in the topmost component and passed down to the children like a waterfall ( the real natural waterfall, not the SDLC one ). However, we usually want to accept user input and change the state, which is props in this case, but props are read-only! What now? Well, ReactJS provides a way to call a function of the upper-level component, and that function calls setState()! Since state can be modified, render() is called after each update and the state is passed to the children again with new values. Seriously, it sounds like a loophole in its paradigm.

Japanese Tokenizer – Design

 

I have drafted a sequence diagram for the application. Although I hesitated between Java and Python, I have settled on Python. Because of the RESTful API for the frontend, I have to consider authentication, and I hate Spring’s security module. I don’t like Spring’s magic to begin with ( e.g. @Autowired, @EnableWebSecurity ); I prefer to have everything wired up by hand. It becomes much more complicated in the tutorial on Spring Security. For example:

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http
            .authorizeRequests()
                .antMatchers("/", "/home").permitAll()
                .anyRequest().authenticated()
                .and()
            .formLogin()
                .loginPage("/login")
                .permitAll()
                .and()
            .logout()
                .permitAll();
    }

I don’t like chained method calls, and I do not appreciate that the guide didn’t explain how this part works. What is and(), and what is the logic behind these lines? What if I don’t want to use a form login? I have an API, so I don’t need a form login. The code does not look like the kind of Java I know.

So I chose Python, although I have more experience with Java. ( I have coded with Python, but not with Django; I expect it to be trivial because the application is small. )

I have been a backend developer on various projects and seldom code frontend stuff. But if I have to make a web application, I have to take care of both frontend and backend. Please excuse me if I make horrible mistakes in web design.

Frontend: A static page with JavaScript ( ReactJS )

Backend: Python Django

Security for the RESTful API: HTTP Basic authentication over HTTPS
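To make the security choice concrete: HTTP Basic authentication only base64-encodes “username:password” into an Authorization header, which is why it has to travel over HTTPS. Below is a minimal sketch with the standard library; the function names basic_auth_header and is_authorized are hypothetical, and nothing here is tied to Django or to the actual project code.

```python
import base64
import hmac


def basic_auth_header(username, password):
    """Returns the value of the Authorization header for HTTP Basic auth."""
    raw = (username + ":" + password).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def is_authorized(header_value, username, password):
    """Server side: checks a received Authorization header value."""
    expected = basic_auth_header(username, password)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest((header_value or "").encode("utf-8"),
                               expected.encode("utf-8"))
```

The frontend server would send the header with every call to the API, and the API would reject any request where is_authorized returns False.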

I only have time to code on Saturdays and Sundays because I can’t focus after a full day of work. For development, I prefer to use a VPS such as DigitalOcean because I don’t want to mess with my PC; however, it is too expensive to spin up a droplet for just a day’s coding and pay for the whole month. I am looking into Amazon AWS, but I couldn’t get SSH working. I used to code on DigitalOcean, and it’s much easier to connect to with SSH. Actually, this blog is hosted on a DigitalOcean VPS.

Let’s continue next time.

Project I have in mind – (2) Weather forecast accuracy tool

Photo by Craig Whitehead on Unsplash

I have another project in mind: I want to obtain the daily weather forecast data from the Hong Kong Observatory. My original idea was to provide a chart of what kind of clothes to wear at different temperatures, down to the exact degree, but I have given that up because I don’t remember what I wear at different temperatures. I might have to conduct a survey for that. I might come back to it later.

My current objective is to gather the daily weather forecast data and cross-check it after the day has passed, in order to know whether the HKO is accurate.

This is an example of the forecast:

Date/Month 7/8 (Tuesday)
Wind: East force 2 to 3.
Weather: Very hot with sunny periods, a few showers and
isolated thunderstorms.
Temp Range: 28 - 33 C
R.H. Range: 65 - 95 Per Cent

I could easily map them to the members of a Java class, scrape the data, and save it into a database.
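As a rough idea of what that mapping could look like, here is a sketch in Python rather than Java. It assumes the forecast arrives as plain text in exactly the layout of the example above; the class DailyForecast, its field names and the function parse_forecast are made up for illustration, and the scraping and database steps are left out.

```python
import re
from dataclasses import dataclass


@dataclass
class DailyForecast:
    day: int
    month: int
    weekday: str
    wind: str
    weather: str
    temp_min_c: int
    temp_max_c: int
    rh_min_pct: int
    rh_max_pct: int


def _merge_wrapped_lines(text):
    """Joins continuation lines (no "Label:" prefix) onto the previous line."""
    merged = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if merged and not re.match(r"^(Date/Month|[A-Za-z. ]+:)", line):
            merged[-1] += " " + line
        else:
            merged.append(line)
    return merged


def parse_forecast(text):
    """Turns one forecast block into a DailyForecast record."""
    fields = {}
    for line in _merge_wrapped_lines(text):
        if line.startswith("Date/Month"):
            fields["Date/Month"] = line[len("Date/Month"):].strip()
        else:
            label, _, value = line.partition(":")
            fields[label.strip()] = value.strip()

    date = re.match(r"(\d+)/(\d+)\s*\((\w+)\)", fields["Date/Month"])
    temp = re.match(r"(\d+)\s*-\s*(\d+)", fields["Temp Range"])
    rh = re.match(r"(\d+)\s*-\s*(\d+)", fields["R.H. Range"])
    return DailyForecast(
        day=int(date.group(1)), month=int(date.group(2)), weekday=date.group(3),
        wind=fields["Wind"], weather=fields["Weather"],
        temp_min_c=int(temp.group(1)), temp_max_c=int(temp.group(2)),
        rh_min_pct=int(rh.group(1)), rh_max_pct=int(rh.group(2)),
    )
```

Storing one such record per forecast day would make the later cross-check a simple comparison between the forecast temperature range and the temperature actually recorded.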

 

I have obtained permission from the HKO:

Dear Mr. Lam,

Thank you for your your submitting the electronic form of request for authorization to re-disseminate the information from Hong Kong Observatory website on 29 July 2018.

The Hong Kong Observatory hereby grant to “LAM WAI MAN” free authorization to reproduce or re-disseminate weather information of HKO website detailed in the Annex as listed below.

1. Such information is for NON-COMMERCIAL use (i.e. not for selling or exchange for benefit, gain, profit or reward in this context).

2. Acknowledgement is given to the Hong Kong Observatory as the source of information.

3. Such information must be reproduced accurately.

For the other conditions of the authorization,  please visit the link below:

http://www.hko.gov.hk/appweb/applink.htm

Note:

All web contents (including the format of HTML files, images, hyperlinks, Javascript codes and etc) of the HKO websites (including its mobile version, text-only version , RSS and etc.) are subject to change without prior notice to the authorized persons or organizations.

Notices to authorized persons shall be posted in the following webpage:

http://www.weather.gov.hk/appweb/noticeboard.htm .

Regards,

TH Chow
Hong Kong Observatory

Now that I have permission from the HKO to scrape their website, I can work on it. But I prefer to do it after the Japanese Tokenizer. I will come up with a design first, though.

Project I have in mind – (1) A Japanese text tokenizer online

Photo by Charles Deluvio 🇵🇭🇨🇦 on Unsplash

 

I attended Japanese classes recently and found a piece of open-source software called MeCab ( http://taku910.github.io/mecab/ ), which was developed by Kyoto University. It is offline software, so one has to install it on a computer. I want to put it online so that it is accessible to other people without installing the software.

Here is an example of using MeCab to tokenize Japanese text:

% mecab
すもももももももものうち
すもも  名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も      助詞,係助詞,*,*,*,*,も,モ,モ
もも    名詞,一般,*,*,*,*,もも,モモ,モモ
も      助詞,係助詞,*,*,*,*,も,モ,モ
もも    名詞,一般,*,*,*,*,もも,モモ,モモ
の      助詞,連体化,*,*,*,*,の,ノ,ノ
うち    名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS

Because Japanese text does not have spaces between words, you can’t easily tell where one word ends and the next begins. For example, this sentence contains 8 consecutive も ( mo ) characters; it is too complex for the average learner, and even Google Translate couldn’t understand it.

Actually すもももももももものうち ( sumomomomomomomomonouchi ) translates to “Both plum and peach are a kind of prunus”. From Wikipedia: “Prunus is a genus of trees and shrubs, which includes the plums, cherries, peaches, nectarines, apricots, and almonds.” The actual sentence structure is this: “すもも も もも も もも の うち” ( Plum and peach and ( is ) peach of type ).
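To show how the MeCab listing maps onto that spaced-out sentence, here is a small sketch that parses the text output shown above. It assumes each output line is the surface form, whitespace, and then comma-separated features with the part of speech first, as in the listing; the names Token and parse_mecab_output are made up, and it works on already captured output rather than calling MeCab itself.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    surface: str          # the word as it appears in the text
    part_of_speech: str   # first feature column, e.g. 名詞 (noun)


def parse_mecab_output(output: str) -> List[Token]:
    """Parses MeCab's default text output into (surface, part of speech)."""
    tokens = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line == "EOS":
            continue
        parts = line.split(None, 1)   # surface, then the feature string
        if len(parts) != 2:
            continue                  # not a token line, skip it
        surface, features = parts
        tokens.append(Token(surface, features.split(",")[0]))
    return tokens


def spaced(tokens: List[Token]) -> str:
    """Joins the surfaces with spaces, one space between words."""
    return " ".join(token.surface for token in tokens)
```

Applied to the listing above, spaced() rebuilds the sentence with a space between every word, and part_of_speech is what a tooltip could display for each token.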

What I want to do now is to provide this Japanese tokenizer online, with a UI similar to Google Translate, in which there are two textboxes: you put words into the left-hand one and get the result from the right-hand one.

It would involve two components: a frontend UI served by a web server, built with Semantic UI or jQuery UI, and a backend RESTful API written in Java or Python. I still haven’t decided whether to code it in Java or Python.

Let’s continue later.