The first time I read "Trust" by Francis Fukuyama, I was deeply impressed by his distinction: Chinese trust is implicit, rooted in an inner circle of immediate family and friends and extended through marriage, while the Western version is explicit, for example, the credit score system.

    Now we have data

    With China's economic boom, and in particular the public's ever-thirsty appetite for any kind of financial leverage, the one barrier everyone agrees is holding lenders back from lending more and borrowers back from borrowing more is the lack of a personal credit score. How wonderful the world would be if only everyone had a credit score! Even better, if that score were tracked and publicized and supervised and distilled by the almighty government, wouldn't it be just a paradise on earth?

    Is it!?

    The concept is very enticing. This score, like the one we have in the US, will discourage you from doing bad things: breaking laws, defaulting on loans, dodging taxes, and even being on welfare for too long. It is a force to stop your sin and encourage your virtue. You are rewarded for your good, and punished for your bad. Common sense.

    But China, what a magical land, possesses an infinite creativity for turning anything upside down. If 取其精华,去其糟粕 (take the essence, discard the dross) is the wisdom of how to learn new things, in practice the "smart" Chinese are doing the exact opposite.

    When I grew up in the 80s and early 90s, the mainstream was filled with news, articles, and classroom materials that were nearly 100% qualitative, with little quantitative analysis to back up their statements. Legend from the dark age of the 60s and 70s had it that data was manufactured on demand to serve a purpose, often political, and therefore reflected little of reality. Trained as an engineer and a believer in science, I found it ridiculous and felt sad for the people who had to live with those lies.

    So things have gotten much better in the past 10 years. Everywhere you turn, there are data, charts, Excel, applications... and then the buzz: big data, data this, data that... If you cannot include the word "data" in your sentence, you are just out: out of fashion, out of mind, and out of the trustworthy circle. What a turn! But that's exactly the part that annoys me.

    When you speak of data, when everything is data-driven, when any argument is now backed by some table, curve, or statistic, when no data is too big and everyone thinks data holds the magic to solve any problem, do they know that this is the golden age of going digital, but the worst time for trust!? I read an article today saying that Alibaba launched Sesame Credit in the hope of leading the way in bringing a credit score into everyone's life. However, you can pay some total stranger $70 to get a perfect score using his or her manufactured data (maybe someone else's data, who knows).

    Data integrity

    This isn't even ironic. This is fucked up. The most common reaction, I believe, is that things will be different if the government is running this instead of a private commercial entity. I'll argue that not only will it make no difference, it will be a terrible idea. Who is supervising the government!? Why is it trustworthy, at all!? What happens when it installs some BS like this social credit score, probably with all the noble intentions, and in the end the score becomes a lawless tool in the government's hand to crush whoever it feels like!? Tell me, how do you prevent that from happening besides praying for its conscience!? Come on.

    But my point isn't about politics at all. The root cause of this is not some bad apples who want to outsmart the system to make a profit. That will always happen. It's human nature. The problem is that people fail to understand that data itself means nothing, absolutely nothing, if it loses integrity. The credit score itself is the icing; the way these data are collected, processed, cross-checked, verified, and validated is the core that creates "trust".

    When a client wants to build a system, the only principle he or she needs to understand is garbage in, garbage out. The system is not smart. Data is not smart. The data generator, whoever that is, is smart. The power of a system is not in accumulating data; it's in the ability to define and guard data integrity. Period.

    How to achieve that? Defining integrity is the job of humans. It is domain specific and context sensitive. As an analogy, say the user agrees that 1 plus 1 equals 2. Then the system's job is to make sure that everywhere there is a 1 plus 1, it flags it green if the result equals 2, and red otherwise.
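
    To make the analogy concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Rule` class, the `flag` helper, and the record fields `a`, `b`, and `total` are made up for illustration. The point is only that a human writes the rule and the system applies it to every record.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A human-defined integrity rule (hypothetical structure)."""
    name: str
    check: Callable[[dict], bool]  # returns True when the record satisfies the rule


def flag(record: dict, rules: list[Rule]) -> dict[str, str]:
    """Flag one record 'green' or 'red' for each rule."""
    return {rule.name: "green" if rule.check(record) else "red" for rule in rules}


# The rule the user agreed on: the stated total must equal the sum of its parts.
sum_rule = Rule("a_plus_b_equals_total", lambda r: r["a"] + r["b"] == r["total"])

records = [
    {"a": 1, "b": 1, "total": 2},  # consistent with the rule
    {"a": 1, "b": 1, "total": 3},  # violates the rule
]

flags = [flag(record, [sum_rule]) for record in records]
for record, result in zip(records, flags):
    print(record, result)
```

    The rule itself is the domain-specific part; the loop that applies it is the easy part.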

    Trust, but verify

    How to guard integrity? Cross-checking. I love President Reagan's "trust, but verify". How do you know your data is correct? Look at it in multiple dimensions and see if the results are consistent. Remember, garbage in, garbage out. This is literally a mental race between the system implementor and the data manipulators. The cheapest, and also the most effective, way to cheat is to feed you garbage in the first place, and that's what's happening to this Sesame score.
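
    One way to picture cross-checking, as a rough sketch only: take the same quantity from several independent sources and see whether they agree. The source names, the amounts, and the 5% tolerance below are all assumptions made up for illustration, not how any real credit system works.

```python
def cross_check(views: dict[str, float], tolerance: float = 0.05) -> bool:
    """Return True if all views of the same quantity agree within a relative tolerance."""
    values = list(views.values())
    low, high = min(values), max(values)
    scale = max(abs(low), abs(high))
    if scale == 0:
        return True  # every source reports zero, so they agree
    return (high - low) / scale <= tolerance


# Hypothetical monthly income as seen from three independent sources.
consistent = {"self_reported": 5000.0, "bank_deposits": 4900.0, "tax_filing": 5000.0}
manufactured = {"self_reported": 20000.0, "bank_deposits": 4900.0, "tax_filing": 5000.0}

consistent_ok = cross_check(consistent)
manufactured_ok = cross_check(manufactured)
print(consistent_ok, manufactured_ok)
```

    A check like this only helps when the sources are truly independent; if one party can manufacture all of them, the garbage simply agrees with itself.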

    So my complaint is that people pick up an idea or a buzzword without looking further into its assumptions, and rush to market it as if it were another silver bullet. The assumption is really what matters. For a credit score, or any other data-driven application, idea, or concept, data integrity is the assumption. It takes tremendous insight into the subject at hand, and a lot of discipline, to verify. Then, and only then, can it create trust.

    — by Feng Xia

