Selling Open Source Applications

Open source applications / libraries generally include a license so that their usage is clear and does not violate the law. The license is usually placed in a LICENSE file in the project's root folder. There are several open source licenses, of which the two most commonly used (in my opinion) are MIT and BSD 2-Clause. The MIT license in particular permits using, distributing, selling, and even relicensing (under a new license), provided that the original copyright notice is retained.

I am not a lawyer, but to me personally, as a layman, that means: if application A has been licensed under the MIT license by X, then anyone who wants to sell application A needs to include the license written by X. If they want to modify it and sell it as application B under license Y, they still need to include the license written by X. This prevents someone from taking credit for, or stealing, another person's work.

So, can I sell an open source application?

Yes, as long as the included license permits selling (the MIT license is very clear on this) or commercial use (BSD 2-Clause says the software may be used commercially, but does not explicitly say whether it may be sold or not).

Isn't that unethical?

Look again at the MIT license: there is a clause stating that the developer is not liable for any damage / problems (roughly speaking) caused by the application / code. Companies and commercial business units usually cannot tolerate this. They need a party who can take responsibility for and resolve problems in the application before they are willing to use it.

This is where the application seller comes in. With a contract approved by both parties, that responsibility becomes the seller's obligation (not the original developer's), and the company can use the application with guaranteed support. So there is no need to be ashamed or overly modest about selling open source applications.

Doesn't it harm the application's developer?

As a developer, you also need to understand that including such a license allows other people to sell, use, and modify the application, provided that:

  • they do not erase the original developer's contribution (no taking credit for it / claiming to be the developer)
  • the developer cannot be held liable in any form

You also need to understand that without a responsible party that companies can hold on to, the application you build will not be useful (it will not be used). So the seller's presence can actually help popularize the application and spread its use.

As a prospective buyer, what should you look for in an open source seller?

The IT business is closely tied to trust. If you cannot trust a seller, do not enter a business agreement with them. Still, there are several things about a seller that you can and should check:

  • responsibility is the main thing, so choose a party with a strong sense of responsibility. Don't let the seller go out of contact one week after payment
  • technical skill is still required. Depending on the contract, at some point the seller must be able to modify / fix the application, so the seller needs to understand its internals to some degree
  • disaster prevention and recovery also matter. Depending on the agreement, the seller should at least have the skills to perform backup and recovery when needed

Conclusion

If you want to sell an open source application, be a responsible seller. Contribute back to the main developer and do not take credit for other people's work. As a buyer, buy on the basis of a business agreement. Everything that needs to be discussed / negotiated must be put into that agreement.

As a developer, know that open source licenses allow other people to sell the application you develop. To some extent, that will help popularize it.

Finally, this is not an article by a lawyer or someone who understands the law. The author cannot be held liable for any damage, or anything else that happens, as a result of following this article.

React hook, can it be better?

I saw the React hooks proposal today, and while it is a solution to some problems, I am concerned with how it uses universal functions (useState) rather than dependency-injecting them, which would be clearer. In this article, I exercise my library-design ability by trying to find another approach to those problems.

First, let's look at their point about “Complex components become hard to understand” from the proposal's introduction.

It pools subscribed events and executes them afterwards

From the example in the following gist, I feel that the implementation idea of useEffect is just subscribing events: pool them, then execute them in order once the render is done. I know that implementation does not follow best practice, but the fact that it fully works concerns me.
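To illustrate the idea, here is a minimal sketch of that pooling mechanism as I imagine it (this is my assumption of the mechanism, not React's actual code):

const pendingEffects = [];

let useEffectLike = function(callback) {
    // called during render: it only records the effect, it does not run it
    pendingEffects.push(callback);
};

let render = function(component) {
    const output = component(); // render phase: effects get pooled here
    // commit phase: run the pooled effects in declaration order
    while (pendingEffects.length > 0) {
        pendingEffects.shift()();
    }
    return output;
};

const App = () => {
    useEffectLike(() => console.log("effect 1, after render"));
    useEffectLike(() => console.log("effect 2, after render"));
    return "<div>hello</div>";
};

console.log(render(App)); // logs effect 1, effect 2, then the markup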

However, I personally don't know whether that kind of design workaround can be exploited, either in a good way or a bad way. One thing is for sure: the same thing cannot happen when the component is declared as a class, or when the functionality is added using a HOC, meaning no such exploit can be present there.

useState in custom hooks blurs the scope

useState is imported from React, which is a kind of global-level module. I'm using the following gist for this example.

I'd like to know: in which scope does the useState state here reside? At first, based on the design, I thought it was a singleton, managed globally. But looking at the implementation, it is scoped per useState call.
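Here is a minimal sketch of how I understand that per-call scoping could work (again an assumption, not React's actual implementation): the renderer owns an array of state slots indexed by call order, so each useState call site gets its own slot.

let stateSlots = [];
let cursor = 0;

let useStateLike = function(initialValue) {
    const index = cursor++; // each call site claims the next slot
    if (!(index in stateSlots)) {
        stateSlots[index] = initialValue;
    }
    const setState = (value) => { stateSlots[index] = value; };
    return [stateSlots[index], setState];
};

let renderWith = function(component) {
    cursor = 0; // reset so call order maps to the same slots on every render
    return component();
};

const Counter = () => {
    const [count, setCount] = useStateLike(0); // slot 0 (setCount writes back into it)
    const [label] = useStateLike("clicks"); // slot 1
    return label + ": " + count;
};

console.log(renderWith(Counter)); // "clicks: 0"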

If you run the gist's code, the log appears 20 times: once per useState call, multiplied by each Elm rendered, then doubled because one run happens on didMount and another on didUpdate (not quite the correct terms, but close enough).

I know the scoping is documented. But as I stated before, the fact that a class declaration does not have this scope issue makes this design somewhat confusing.

Can it be better?

Better is usually relative to something. I prefer a design that has more explicit declarations and a clearer flow. But mine is verbose and increases code size, which makes it worse in that aspect.

Reusable Logic

First, the reusable logic for components envisioned by the React team more or less consists of these aspects:

  • can maintain its own state
  • can have parameters, which are independent of the component's props
  • its state can change, and the attached component is able to watch it
  • its state can be changed asynchronously
  • can run some logic in the side-effect phase, e.g. at didMount

Functional Component

Then the functional component has these aspects:

  • can maintain its own state
  • can use / attach many pieces of Reusable Logic and watch their state
  • can run some logic in the side-effect phase, e.g. at didMount

The design

This gist is an example of how I envision React hooks being declared. For simplicity, I use the obsolete createClass method to indicate component composition.

At useFriendStatus (a rough sketch follows this list):

  • it accepts a parameter: a self-managing state
  • it then returns a function that accepts the parameters specified by the developer
  • the body is where the logic resides
  • finally, it returns an object containing state that can be listened to, and a side effect (if any)
  • the effect can return another function, which will be executed on willUnmount
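This is roughly the shape I have in mind. The useFriendStatus example follows the one in the React docs, but the wiring is my own design, and ChatAPI is stubbed here so the sketch is self-contained:

const ChatAPI = {
    subscribeToFriendStatus: (id, handler) => handler({ isOnline: true }),
    unsubscribeFromFriendStatus: (id, handler) => {},
};

const useFriendStatus = (state) => (friendID) => {
    const handleStatusChange = (status) => state.set({ isOnline: status.isOnline });

    return {
        state, // state the component can watch
        effect: () => {
            ChatAPI.subscribeToFriendStatus(friendID, handleStatusChange);
            // the returned function is executed on willUnmount
            return () => ChatAPI.unsubscribeFromFriendStatus(friendID, handleStatusChange);
        },
    };
};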

Then FriendStatus, the component, has 2 parameters, the 2nd one being the component's specification:

  • it specifies the initialState to be passed to the component
  • it also specifies which hooks to attach, by injecting props for the initial state, and a createState function that can be used to create self-managing state for the reusable logics
  • the hooks specification returns an array of reusable logics, each of which has already been injected with its own self-managing state

Then the 1st is the render part (a sketch of the full wiring follows this list):

  • as usual, the normal props are injected as the 1st param
  • the 2nd param is the self-managing state, holding the initialState value from the specification
  • the 3rd param is the list of hook states that can be watched; its order matches the hooks' definition order in the specification
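Building on the useFriendStatus sketch above, here is a minimal, hypothetical createClass-style builder that wires the specification to the render function (my assumption of how the pieces could fit together, not the actual gist):

let createClass = function(render, spec) {
    return (props) => {
        // createState produces a self-managing state with get / set
        const createState = (initial) => {
            let value = initial;
            return { get: () => value, set: (patch) => { value = Object.assign({}, value, patch); } };
        };
        const selfState = createState(spec.initialState(props));
        const hooks = spec.hooks(props, createState); // executed once
        const output = render(props, selfState, hooks.map((h) => h.state));
        // side effects run after render; their return values run on willUnmount
        const cleanups = hooks.map((h) => h.effect && h.effect()).filter(Boolean);
        return { output, unmount: () => cleanups.forEach((fn) => fn()) };
    };
};

const FriendStatus = createClass(
    // 1st param: render(props, selfState, hookStates)
    (props, state, [friendStatus]) => {
        if (friendStatus.get().isOnline === null) return "Loading...";
        return friendStatus.get().isOnline ? "Online" : "Offline";
    },
    // 2nd param: the specification
    {
        initialState: (props) => ({}),
        hooks: (props, createState) => [
            useFriendStatus(createState({ isOnline: null }))(props.friend.id),
        ],
    }
);

console.log(FriendStatus({ friend: { id: 1 } }).output); // "Loading..." (the subscription fires after render)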

What is the difference?

For me, my version explicitly shows where the state comes from. Its living scope is also clear, because it is defined in the hooks specification. And it doesn't use any universal-level functions, except for the builder-pattern-like createClass.

The hooks specification being executed once, as a declared specification, makes the declaration more explicit. The reusable logic being a pure function also makes it easier to test, mock, or reuse with other libraries. Of course some side-effect and conditional hacks can still be applied, but that's another topic, to be discussed later.

However, it is clearly worse in terms of code size, since it adds more detail to the code. It also reduces the number of manageable states to one per reusable logic.

Conclusion

Tweaking an RFC concept and trying to find its flaws and a better approach is a fun experience. It also serves as an exercise in designing a library API.

Mutability: Array

Mutability in programming is the ability of an object to have its state / value changed. An immutable object, on the other hand, cannot have its state changed. In PHP, JavaScript, C#, and Java, most commonly used variables / objects are mutable. Arrays are one of them.

Array mutability

Let’s see the following snippet of code:

let exec1 = function() {
    console.log("START: EXEC 1");
    let a = [3, 5, 7];
    let wrongOperation = function(arr){
        arr.push(8);
        arr[1] = 4;

        return arr;
    };

    let b = wrongOperation(a);
    console.log("b:");
    b[2] = 6; // mutation
    console.log(b); // results [ 3, 4, 6, 8]

    console.log("a:");
    console.log(a); // results [ 3, 4, 6, 8]
    console.log("DONE: EXEC 1");
    console.log();
};

We can see that any modification to b also changes the value of a, which is sometimes expected and at other times becomes a bug, for example when the result b is returned and modified by yet another caller. Some day you may wonder why the content of a changed, and it will be hard to track down where that change happened.

A better design

A better design is, indeed, to prevent any changes made to b from being reflected back to its original owner, a. This can be achieved by replicating the array argument into an empty array, using concat in JavaScript or array_merge in PHP. See the example in the following snippet:


let exec2 = function() {
    console.log("START: EXEC 2");
    let a = [3, 5, 7];
    let correctOperation = function(arr){
        let result = [].concat(arr);
        result.push(8);
        result[1] = 4;

        return result;
    };

    let b = correctOperation(a);
    console.log("b:");
    console.log(b); // results [ 3, 4, 7, 8]

    console.log("a:");
    console.log(a); // results [ 3, 5, 7]
    console.log("DONE: EXEC 2");
    console.log();
};

The above example shows how the operation copies the argument array, using concat with an empty array, before doing any operation. This causes any further modification after that function to not be reflected back to the original variable a.

Another example can be like this:


let exec3 = function() {
    console.log("START: EXEC 3");
    let a = [3, 5, 7];
    let anotherCorrectOperation = function(arr){
        let newData = [];
        for(let i = 2; i < 4; i++){
            newData.push(i);
        }
        return arr.concat(newData);
    };

    let b = anotherCorrectOperation(a);
    console.log("b:");
    b[1] = 4; // test mutability
    console.log(b); // results [ 3, 4, 7, 2, 3]

    console.log("a:");
    console.log(a); // results [ 3, 5, 7]
    console.log("DONE: EXEC 3");
    console.log();
};

The above example does its operations first, then returns the operation result concatenated with the existing array. This is preferable to the earlier approach of pushing directly into the argument array.

It's still just the array that is copied

However, both preferred examples above only copy and un-reference the array, and the array only. The contents are still the same and can still be modified. For example, if the array contains JavaScript objects, any modification to an array member will be reflected back to the original variable a:

let exec4 = function() {
    console.log("START: EXEC 4");
    let a = [{v: 1}, {v: 3}];
    let anotherCorrectOperation = function(arr){
        let newData = [];
        for(let i = 2; i < 4; i++){
            newData.push({v: i});
        }
        return arr.concat(newData);
    };

    let b = anotherCorrectOperation(a);
    console.log("b:");
    b[1].v = 4; // test mutability
    console.log(b); // results [ { v: 1 }, { v: 4 }, { v: 2 }, { v: 3 } ]

    console.log("a:");
    console.log(a); // results [ { v: 1 }, { v: 4 } ]
    console.log("DONE: EXEC 4");
    console.log();
};

So you still need to be careful when modifying variables inside functions. But at least you no longer need to worry when modifying the array itself.
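If you also need to un-reference the members, one option is to copy each object while copying the array. A shallow copy per member, as sketched below, is enough for flat objects like {v: 1}; deeply nested objects would need a deep clone:

let exec5 = function() {
    let a = [{v: 1}, {v: 3}];
    let deepCorrectOperation = function(arr){
        // copy the array and each member object
        return arr.map(item => Object.assign({}, item));
    };

    let b = deepCorrectOperation(a);
    b[1].v = 4; // test mutability

    console.log(b); // results [ { v: 1 }, { v: 4 } ]
    console.log(a); // results [ { v: 1 }, { v: 3 } ]
};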

Conclusion

Design your operations to be immutable and to return copies by default, unless the other behavior is specifically desired. This helps make code easier to track and more modular, and prevents unnecessary bugs in the future. All code shown in this article can be retrieved from my GitHub repository.

Why should you give buffer to project estimation

“Why does this enhancement need 1 month? Can't it be done in just 3 weeks?”

This is often said by project managers to developers. They like to squeeze estimations as short as possible, then in the end wonder why the project is running late. Overtime happens and code quality drops, with bugs everywhere from development until launch. This is one factor that I, as a developer, see as the mark of an “incapable” project manager. Furthermore, clients often have the same goal: to cut the estimation as short as possible. Little do they know that adding buffer to an estimation brings more benefits than shortening it.

Less overtime

Overtime is unproductive. It has been researched many times, and you can easily look up those studies on the internet or in forums. It is obvious: more buffer in the estimation leads to less overtime, which overall prevents the long-term productivity drop.

More often than not, I find that unjustified overtime doesn't bring better results over a week / month time span compared to working normally. That's because the work done in overtime is usually either wrong, of bad quality, or off-requirement, and it usually has to be fixed in the next week / month anyway.

On the other hand, justified overtime is a necessity, like when there is a critical bug in production, or when a bug is found at the last moment before launch.

No spare time for changes

Clients are very bad at giving requirements. They don't know what solution they want. They may not even know what problem they are facing. In my experience, there is a 100% chance that at least one small requirement change, and at least a 50% chance that one medium / big requirement change, will happen during the project timeline.

To this day, prototyping remains very useful, especially in the programming world. Usually a prototype is made to give the client a clear picture of how the program will work and how it will help solve their problem. Requirement changes are highly likely to happen at this point.

Setting a tight estimation is a disaster here. With a tight estimation, no change can be integrated into the timeline, since everything has already been calculated. This can only lead to bad things: a missed deadline, inapplicable requirements, or bad quality.

No room for mistakes

Mistakes happen; there is not a single piece of software in this world that is bug-free. Even the business flow, a requirement given by the client, may be incorrect. Setting a tight estimation without accounting for mistakes and fixes is planning to fail. More buffer in your estimation means the application can be tested more, and more problems can be fixed in the meantime.

You will never deliver “earlier”

There are two possible situations when delivering: delivering “early” or delivering “late”. Delivering exactly “on time” is almost impossible. Usually the project is simply done and ready to be delivered before the specified time, so it counts as “early”.

Now, which one is better: “a longer estimation, delivered ‘early'” or “a shorter estimation, delivered ‘late'”? It comes down to personal preference, but I usually find that delivering “early” leaves a better impression than delivering “late”. An “early” delivery is usually perceived as faster than the opposite, a psychological thing.

By setting a tight estimation, you are setting yourself up to deliver “late” and never “early”.

It is indeed hard on the “client” side

Clients are hard to deal with. More often than not they aren't satisfied with the estimation given, no matter how tight it is. Though they want a tighter estimation, they cannot give clear requirements. “Just make it so that we input this, that and that and it's saved! How hard is it?” is what they usually say, without knowing what must be done if a mistaken input needs to be edited, what happens if there are changes, which systems are affected by a change, etc.

That's why a good client brings good software as well. If you want to teach a client why a tight estimation is bad, give them a tight estimation. Then, every time there is a change and every time a requirement discussion with the client happens, account for all of it. After delivery, hold an evaluation session with the client and show them how all of those things delayed the estimation and made the project late.

The delivery time will be the same after all

Many times I find that a tight estimation ends up late, and the delivery time is usually the same as if I had given a buffered estimation in the first place. Inexperienced developers and managers usually prefer to give tighter estimations, underestimating how much time changes and fixes will take. The question is, how much buffer is needed?

In a previous job, I liked to add a 30-50% buffer to my estimation. My PM would then bargain it down by 20-30%, and give back some buffer for the QA and fixing phase. In the end, I assume it came to around a 25-40% buffer. With that, I usually delivered 10-20% early, so 20-40% seems to be the sweet spot, depending on how complex and big the project is. This is just my preference and personal experience; do not take it as guidance, since everyone estimates differently.

Now, if the delivery time is the same after all, why not give the longer estimation and gain more flexibility in requirements and development? It will give the software a better foundation and better code quality after all.

Summary

Give buffer to your estimation. Its benefits far outweigh the false perception that “a shorter estimation is faster”. You won't be flexible, and you won't produce good quality, if the estimation is tight.

PHP Nested Ternary Operator Order

Now let's say we have the following pseudo-code that we want to implement in several programming languages:

bool condition = true;
string text = condition == true ? "A" : condition == false ? "B" : "C";
print text;

Here is the implementation in JavaScript:

let st = true;
let text = st === true ? "A": st === false ? "B" : "C";
document.write(text);

Here is the implementation in PHP:

$st = true; 
$text = $st === true ? "A": $st === false ? "B" : "C"; 
echo $text;

As a bonus, I also tried the same pseudo-code in C#:

bool st = true;
string text = (st == true) ? "A" : (st == false) ? "B" : "C";
Console.WriteLine(text);

Now, let's guess the result of the variable text. The expected value should be A, and the other languages indeed produce A, but PHP produces B. This is because PHP's ternary operator is left-associative, unlike in most languages, where it is right-associative. PHP therefore evaluates ($st === true ? "A" : $st === false) ? "B" : "C": the inner expression yields "A", which is truthy, so the outer ternary yields "B". Well, this is not a great discovery, but many PHP developers may still be missing it, so I think it's worth archiving. Parentheses can fix it, but they make things ugly; it looks like it's better to stick with an if-else statement then.
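For completeness, this is what the parenthesized fix looks like (note also that unparenthesized nested ternaries are deprecated in PHP 7.4 and rejected in PHP 8):

$st = true;
// force the right-associative grouping that other languages use by default
$text = $st === true ? "A" : ($st === false ? "B" : "C");
echo $text; // A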

Security is hard

The recent issue with the Meltdown and Spectre attacks shows how hard security implementation is. As a short explanation, those two attacks take advantage of the CPU's speculative execution and error handling to gain access to, and read, memory addresses they are not authorized for. Patches were published by the respective vendors and OSes right after. However, the real issue is that the applied patch can bring performance down by up to 30%! And this is what I want to raise in this article.

Trade-off

Ignoring programmer effort and development cost, a security implementation may or may not have a trade-off, but it is more likely to have one than not.

Take, for example, a security token for online banking. It is a security implementation that reduces UX (user experience) by adding one verification step. In this case, though, the trade-off is worth it: it helps the user verify the input and prevents wrong transactions that would otherwise be too easy to make.

Asking the user for a username and password on every login is also a UX trade-off, for which there are now alternatives like “login with Facebook”, “login with Twitter” and so on. And the majority of trade-offs, as in the latest Meltdown case, are performance drops due to another verification step.

Trade-off vs Risks

Security flaws, after all, are just risks. Only when an attack is actually executed does a security flaw become a loss. Usually a security fix brings only a negligible trade-off (a small performance drop), so it is better to implement it than not. For example, preventing SQL injection and XSS, one-way salted password hashing, and using HTTPS are common practice. They should be enforced, because otherwise it would be too easy for an attacker to exploit the flaw and take advantage of it.

However, with the up to 30% performance drop in the latest case, and considering how complex a successful Meltdown attack is and how many preconditions it has, the trade-off to risk ratio can be considered high. In this case, there is an “advantage” in not fixing the security flaw and simply hoping that attackers either do not target you, do not attempt this specific attack method, or are simply not interested enough to waste their time on it.

However, the risk will always be there, and attackers may get better and better tools to exploit the flaw, while at the same time we can hope for better and better fixes with lower trade-offs. In the end, it is top-level management and the developers who decide whether it is better to patch right away or leave it as is.

After all, security is hard.

Debugging / learning is a scientific cycle

This is a little shower-thought idea I got a while ago: by debugging, you are more or less actually performing a scientific cycle, though a simpler one than the real thing. A simple, full scientific cycle consists of: make observations, form a theory / hypotheses, perform an experiment, make observations based on that experiment, repeat. I know, the actual scientific process and debugging are both more complicated than that, but the general idea of their flow is that cycle.

Do observation

When you encounter a bug or an abnormality in a process, the very first thing you need to do is observe it. You will need to observe some (or all) of these things:

  • what happened during the specific process
  • what was the process doing
  • what is affected
  • what is the current result
  • what is the desired result
  • where is the difference
  • what is written in the log
  • is there any more information that you can gather?

The observation process leads to the next step: making hypotheses.

Making hypotheses

You craft hypotheses from all of the information gathered during the observation process. Some of the hypotheses could be:

  • it only occurs for requests with a specific value in a specific field
  • it only occurs when the user does those specific steps
  • when the machine is under high load
  • when there is an issue with the internet connection
  • when the machine's recommended specification is not met
  • when some of the apps are outdated
  • and so on…

If insufficient information was acquired in the previous step, the worst hypothesis available is: the bug will happen if you perform the same steps, with the same data, on a system with the same configuration, maybe at a specific time. No matter what your hypotheses are, the best experiment to perform next is to reproduce the issue.

Perform an experiment: reproduce the issue

This is one of the hardest steps of debugging: creating an environment that can reproduce the issue consistently. Many issues are hardware-specific, concurrency / race-condition-specific, or network-issue-specific, and many other complex situations can produce them. But this hard effort provides big rewards: you will understand the process better, it will be easier to pin down the cause, and you will be sure that the fix really solves the issue.

Once you can reproduce the issue consistently, you can take the next step: add more logging, set up debugger tools, and then continue with observation.

Do observation, make hypotheses and experiment again

With more information and the ability to reproduce the issue, you can perform the cycle repeatedly. Observation produces information, which is used to make hypotheses; you make a fix based on the hypotheses, observe whether the fix really solves the problem, make another hypothesis if the problem is still there, perform another fix, repeat.

In the last iterations, you may observe that your change to the application has fixed the problem. Then you start to form theories (hypotheses supported by facts from tests) and do more experiments to prove them. For example, you can revert the application to reproduce the same error under a different condition, or run the same steps with different data to ensure that the fix is correct. If your theories are proven by more tests, the debugging process is complete.

Unit test

Now we can see that the debugging process above is very complex and time-consuming, especially when you need to re-create the environment to reproduce the error every time. Unit tests are an amazing tool in this case.

With a unit test, you can experiment in a more isolated environment, easily tinker with the data using mock objects, replicate the situation or configuration (set the time to a specific value, say), and much more in order to reproduce the issue. Once the issue has been reproduced, the test results in a failure / error.
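As a sketch of that shape (the bug and names here are hypothetical, and plain Node assert is used so it stays self-contained):

const assert = require("assert");

// suppose observation suggested that applyDiscount breaks for a zero price
let applyDiscount = function(price, percent) {
    return price - (price * percent) / 100;
};

// the test reproduces the issue in isolation: the input data is fully
// controlled here, with no need to re-create a whole environment
assert.strictEqual(applyDiscount(200, 10), 180);
assert.strictEqual(applyDiscount(0, 10), 0); // this failed before the (hypothetical) fix
console.log("all assertions passed: the fix is verified");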

Then the fix you make is tested again until it produces the correct expectation, and the other existing unit tests help ensure it doesn't break anything elsewhere. Amazing, right?

Conclusion

Debugging is more or less similar to how you would perform a scientific experiment / research. It is a repetitive cycle of observation, hypotheses, and experiments. Unit testing can help the process greatly, since it creates an enclosed environment in which you can perform experiments under artificial conditions.

Why PHP is bad and why it isn’t

Nowadays, programmers consider PHP a very bad programming language. One example is the comic strip about saving the princess with programming languages. But why is PHP considered bad, and why is it still so popular out there?

The good

In general, PHP is a good language to start learning programming with.

It's easy to set up and start

PHP is very easy to set up, especially for a beginner. Just use XAMPP (for Windows) or LAMP (for Linux), drop the code into htdocs, and everything will go well. Just search Google for “hello world php xampp” or “hello world php lamp” and you're good to go.

Furthermore, it's one of the easiest languages to put on shared hosting, making it very easy to make your own website.

It's very forgiving

PHP is dynamically typed, meaning you don't need to specify whether a variable is a string, an int, a specific class, etc. And its string concatenation operator (.) is separate from numeric addition (+), making it less ambiguous than JavaScript's dynamic typing while needing no explicit type conversion. It's very easy for a beginner to start with.
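A small illustration of that separation (the JavaScript comparison in the comment is the part beginners usually trip over):

$a = "1" + 1; // 2, since "+" is always numeric addition
$b = "1" . 1; // "11", since "." is always string concatenation
// in JavaScript, "1" + 1 gives "11", because "+" is overloaded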

And PHP variables work very well with HTML. Almost all native variables can be printed to the screen using echo, while arrays and objects need special treatment.

Furthermore, using an undefined variable only results in a notice, which can easily be suppressed. But beware: both are considered “bad habits” in programming, so take them as learning features. There are more cases that usually result in an error in other languages but can easily be suppressed in PHP.

It’s both procedural and OOP

PHP can serve procedural code as well as OOP code. It is very common to start learning programming procedurally and learn OOP next, and it's easier to do both in the same language.

Furthermore, PHP is a C-like-syntax programming language, and there are many good languages with C-like syntax, like Java, C#, and JavaScript. Its C-like syntax is better than Python's (Python is also a good starting language) “if” you aim to move to those languages later.

Frameworks and tutorials are abundant

With many frameworks and tutorials out there, you can search for any problem or topic you are currently working on and find many pages of Google results. It's very easy to find answers to PHP problems nowadays.

Furthermore, many PHP frameworks use the MVC (Model View Controller) pattern, which is one of the most common patterns in web programming. Learning them can help the transition to other good MVC-based stacks, such as Java Spring MVC, C# ASP.NET MVC, NodeJs MVC frameworks, and many more.

Furthermore, PHP nowadays has Composer, which handles libraries as packages, as almost all new languages do. And PHP has many CMSes, like the WordPress CMS, that make creating a webpage easy.

The bad

So why is PHP considered bad? Well, you need to be at least fairly good at programming to know its limitations and bad side.

It is not strongly, statically typed

PHP started, ages ago, as a dynamically, weakly typed language for customizing HTML pages. To this day it still supports dynamic typing, while also supporting some type hinting at the argument and property level. While dynamic typing is good for starting to learn programming, it is not good for complex business processes.

However, being an interpreted language means the type hints can only trigger when the code is executed. So we won't get a type error until that portion of code runs, as opposed to Java / C#, where it is caught at compile time.
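For illustration, a type-hint violation like the one below (a deliberately trivial, hypothetical example) only surfaces when the bad call actually executes:

declare(strict_types=1);

function double(int $value): int {
    return $value * 2;
}

echo double(2); // 4
echo double("2"); // TypeError, but only thrown when this line runs;
                  // a compiled language would reject it at build time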

Moreover, PHP 7, even after getting scalar type hints for string and int, still has no generics for arrays. Without any means of type-checking an array's contents, it is harder to enforce reliability, especially in business processes (accounting, say).

It doesn't have native multithreading options

Without the additional “PThreads” component, PHP doesn't have any option to emulate multithreading. It isn't that PHP cannot do multithreading; the problem lies in how “PThreads” works. It copies the current “process” state (loaded classes, etc.) into each spawned thread and executes them concurrently.

In my experience with PThreads on PHP 5.6 (maybe I just lacked configuration, correct me if so), PThreads used more memory than other programming languages do, notably C#, Java, and NodeJs. Moreover, it is harder to catch exceptions from, and to debug, the work spawned in threads.

So it doesn't support multi-core processing

For heavy background processes or batch processing, multi-core support is a requirement most of the time.

It doesn't have a memory-persistent cache

PHP is a run-and-forget scripting language: it loads all the classes a request needs at the beginning of the request (and during execution, for lazily loaded ones), and flushes them afterwards. The process takes time, and while PHP 7's opcache keeps some of the compiled code around, it is still not efficient, because the classes need to be loaded for every request.

In contrast with PHP's scripting model, NodeJs and C# ASP.NET MVC (I haven't used Java, but it should be similar) run a server and keep the loaded classes (scripts) in memory, making them more efficient.

Its dynamic typing takes too much memory

This looks mitigated in PHP 7, but in PHP 5.6 and below, a dynamic variable in PHP takes too much memory. It quickly becomes a hassle when working with big variables, big files, or many records of data.

And even though PHP 7 is more efficient, it still can't beat C/C++'s level of per-variable memory usage. Arguably the same holds in comparison with statically typed languages such as Java, C#, and the currently rising Golang.

Its data access doesn't support multiple result sets

This applies to MySQL at least (it looks like it is supported with PostgreSQL). PHP cannot return multiple tables from one query: say you have a stored procedure that returns 3 select queries; the PHP MySQL driver can only return one.

Much of its library support is configured at installation level

Some of PHP's native libraries are configured during installation (gcc, make, and phpize). Examples are zip (--enable-zip) and thread safety (--enable-zts) for PThreads. This makes binding such configuration to the app's repository harder and reduces portability.

In conclusion

PHP is a good language to start programming with: it is easy to set up and has many libraries / frameworks / CMSes. However, for advanced use by expert programmers, PHP doesn't really meet the requirements.

Conference experience: #tiajkt2017

First and foremost, thank you to Tech in Asia for the free #tiajkt2017 conference invitations for my workplace. What an amazing conference. Now let me share some of my experience attending it.

It’s so full of visitors

The conference is big and the hall is also big, yet it was full of visitors everywhere. Not only people from the startups and the Bootstrap Alley; it was also full of academics, investors, and general guests. However, I feel bad about the Bootstrap Alley's long registration queue in the morning; I hope it will be better next time.

Notes from developer stage

Today I came mainly to attend the developer stage. As a programmer myself, I found the seminar materials very useful. For those who did not attend the conference / stage, here are some notes that I think are important to know:

Artificial Intelligence is rising in popularity

Artificial Intelligence, with the more specific Machine Learning as a subset, is gaining popularity in the IT industry. Its uses are vast, and it will be very useful for (but not limited to):

  • Targeted advertising and marketing
  • Personalization
  • Cutting repetitive workloads such as managerial approvals and document-filling checks
  • Recognition (face / image, sound, text)

My speculation is that the popularity of big data in recent years has enabled companies to do further data mining and, in turn, develop artificial intelligence. It will be popular and mainstream in the next 5-10 years, so make sure to invest your skills there (me too; I need to learn it asap).

Mobile is the king

Unless a startup's field is SAAS for administrative B2B systems, its applications will be oriented toward mobile devices. It is no wonder, then, that there is rising demand for Ionic, Android, Angular, and React programmers. It is beneficial for you to learn any of those technologies. And startup founders, please be aware that this is the era of the mobile device, and consider hiring at least one person who excels at mobile programming.

Metrics are important

For companies, and especially for startups, it is very important to have metrics. Without them, you won't know how good your progress is, or how to measure your recent performance. For startups specifically: keep dreaming about getting good investors without good metrics.

And make sure your metrics are correct and aligned with your company's values. Measuring the number of Instagram posts with your tag won't be useful for an e-commerce startup.

Clouds are rising higher

With the rise of big data and artificial intelligence, the need for higher-spec hardware and for absorbing workload fluctuations is appearing. The cloud is one of the solutions: it offers performance scaling for a specific time window (a 100% increase in RAM for the next 24 hours, for example). No wonder it is more popular now.

Startup types

I see more startups working in the recruitment area, maybe inspired by how hard it is to find good programmers. I wonder whether they will provide a solution (since, in reality, good programmers are really scarce). There aren't many e-commerce startups, and most of them are B2B; that's good. Some ad-hoc service provider startups, like plumbing services, are common too. And uniquely, there are some trying to work in agriculture, and one doing AI for chat and social media marketing. Keep up the good work!

So…

It was an amazing experience. It is unfortunate that there won't be any programmer topics on the 2nd day, though. Maybe TIA can also add an IT recruitment segment to their next conference? Lastly, I want to say thank you, and congratulations to Tech in Asia for hosting such an amazing conference!