Thursday, December 31, 2020

Adieu 2020

One of the most unprecedented years of recent human history, a year in which we saw antitheticals, opposites, and naysaying come to reality. Whatever was supposed to be taboo to a good social life turned into reality. Established customs, cultural patterns, and etiquette went for a toss. Strangely, for many people the year passed faster than any other. Possibly this was because people had few memories or outings, and even the news items were limited and repetitive. Consequently, the year felt as if it had passed quickly.

The year brought a change in my mental setup. I became less empathic and more routine-bound. Life seemed to gain stability and routine. I became less satisfied and a little bit greedy. The desire to achieve and to earn money, status, and fame began to rise again. This was mentally unsettling. The state of peace you enjoy when you have fewer desires and high stability seemed to get unbalanced by the rise in expectations. The workaholic me returned to its old state, and sometimes my mind became work-paralysed. That is a state in which you have enormous pending work but do little, because of unclear prioritization and confusing ambitions. Overall, the mental restlessness increased in the second half of the year, which was somewhat disturbing to realise. It is like finally throwing away some baggage after struggling hard for years, only for social pressure, personal desires and ambitions, career compulsions, and the quest for the future to push you down the same track again.

So for 2021, my utmost effort will be on living effortlessly and enjoying life peacefully.

Technologies for next decade

In December, Wiki brand released a list of 30 prospective technologies that will define the course of the next decade. Quite predictably, some of the major ones were Artificial Intelligence, IoT, Blockchain, Quantum computing, Mobile Computing, Automation, Proximity Tech, and Edge computing. On the face of it, the list looks complete and futuristic. However, a little reflection on the list offers a different insight.

The list looked heavily West-centric. It seemed to ignore the issues of the eastern developing and underdeveloped world. The key problems of the developing world are not robots or automation but poverty, hunger, sanitation, service delivery, and the like. The challenge is that western solutions do not simply carry over to developing-world problems, primarily for two reasons. First, the first world has never faced some of these problems, such as crowd management. Second, where it did face them, the scale was minor and limited. Hence the list of decade-defining technologies could be slightly different for the developing world. Some of these are listed below:

1. Sanitation technology: India has huge scope for sanitation technology research. Just imagine a smart dustbin where you dump your waste and obtain credit for free Wi-Fi, or a just-in-time waste pickup and processing pipeline. Or imagine small-scale waste processing machines installed in the backyard which can generate energy and extract metals and minerals.

2. Artificial Intelligence: Services based on real-time face recognition or image matching can play a phenomenal role in regulating traffic, controlling crowds, and bringing discipline to public life. An AI/ML-based governance framework would be more judicious and scientific in resource allocation.

3. Public audit systems: 24x7 live monitoring of public offices and public projects can help keep watch over the performance of public offices and the progress of public projects. Online detailing of account expenses through a user-friendly interface can make the whole of India a participant in the entire process.

4. Green technologies: Cheap solar panels, smart metering, water harvesting systems, piped gas lines, piped water lines (clean water tech), energy-efficient housing technology, rooftop and backyard landscape design using tree plantation, electric vehicles, and an efficient public transport system are the need of the hour for our nation.

5. Mobility-based digital solutions: Schools, dispensaries, public offices, post offices, courts, police stations, toilets, cinemaghars, election booths, etc. could be installed on customized vehicles like cars, buses, and trucks rather than in brick-and-mortar buildings. The mobile office could be taken to the doorstep of the common man, rather than the common man searching for it.

This is just a sample list. It can be detailed out, and an equivalent list of 30 technologies could be identified which would cater better to the needs of countries like India. However, there are a few which were not on the original list and which deserve special mention. These would be:

1. Brain, Cognitive sciences, Cognitive neuroscience

2. Space sciences, cheap rockets, and discovery of remote planets

3. Faster public transport systems, like flying cars, electric vehicles, etc.

Anyway, the new decade will begin tomorrow. Let us see what it has in store for us.

Unsupervised learning

In supervised machine learning, the input data is labeled with the correct answer. The input data is split into two parts, namely training data and test data. However, imagine a situation where the data is not labeled with a correct answer. Suppose you are given the Income Tax Returns of a huge number of users and you wish to find the anomalous return filers, or outliers. Or let's say you wish to map the users into different clusters as per their demographic attributes. In such situations, data scientists opt for unsupervised machine learning.

Unsupervised machine learning (UML) helps in discovering hidden patterns in the data. For instance, let's assume you have images of a set of animals. A UML model can help you partition and cluster the different animals into different buckets without being explicitly trained on their attributes and features. Let us understand how this magic happens.

Whenever a huge amount of unlabeled data is passed to the model, the model finds certain common features in the data. On the basis of these common features, the data points are clustered, partitioned, and categorized into different sets. For example, suppose you have 1000 points with X and Y coordinates and you wish to divide them into two clusters. In a primitive setup, one can take the following five steps:

1.  Initialize two hypothetical (X, Y) tuples m1 and m2 as the centroids of two empty buckets C1 and C2.

2.  For each point, find its distance from the two centroids.

3.  If the distance to m1 is less than the distance to m2, put the point in bucket C1; else put the point in bucket C2.

4.  Recalculate the mean points of the two buckets C1 and C2 and take them as the new centroids m1 and m2.

5.  Repeat steps 2-4 until the contents of the buckets stabilize.

In the end, you will have all the points in the data space categorized in the two buckets C1 and C2.
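
The five steps can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function names two_means and mean_point, the sample points, and the starting guesses for m1 and m2 are all assumptions made for the example.

```python
import math


def mean_point(bucket):
    """Return the centroid (coordinate-wise mean) of a non-empty bucket."""
    return tuple(sum(axis) / len(bucket) for axis in zip(*bucket))


def two_means(points, m1, m2, max_iters=100):
    """Split 2-D points into two buckets, starting from centroids m1 and m2."""
    c1, c2 = [], []
    for _ in range(max_iters):
        c1, c2 = [], []
        # Steps 2-3: put each point in the bucket of its nearer centroid.
        for p in points:
            if math.dist(p, m1) < math.dist(p, m2):
                c1.append(p)
            else:
                c2.append(p)
        # Step 4: recompute the centroids (an empty bucket keeps its old one).
        new_m1 = mean_point(c1) if c1 else m1
        new_m2 = mean_point(c2) if c2 else m2
        # Step 5: stop once neither centroid moves any more.
        if new_m1 == m1 and new_m2 == m2:
            break
        m1, m2 = new_m1, new_m2
    return (m1, c1), (m2, c2)


# Hypothetical data: two small groups of points, far apart from each other.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

# Step 1: deliberately poor starting centroids, both inside the first group.
(m1, c1), (m2, c2) = two_means(points, m1=(0, 0), m2=(1, 1))
print("C1:", m1, c1)
print("C2:", m2, c2)
```

Even with both starting centroids placed in the same group, the loop settles with one bucket per group after a few passes. On less tidy data the outcome can depend on the starting centroids.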

A careful reader can spot the devils in the details of the five steps above. There are four devils sitting in these five steps.

1.  How do you decide the value of K? The number of clusters/buckets could be 2, or 3, or even N.

2.  How do you define the distance function to calculate distance between data points in a complex scenario?

3.  When do you say that C1 and C2 are stabilized? How do you calculate loss or distortion which has to be minimized for accurate estimation of stabilization?

4.  Who is going to validate the results?
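
The first and third questions can be explored together. One common measure of distortion is the sum of squared distances from each point to the centroid of its bucket, and a widely used heuristic is to compute it for several values of K and look for the point where it stops dropping sharply. The sketch below assumes Euclidean distance and, for simplicity, takes the first K points as starting centroids; the function names and sample data are hypothetical.

```python
import math


def k_means(points, k, max_iters=100):
    """Plain K-means; the first k points are the starting centroids (an assumption)."""
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    buckets = [[] for _ in range(k)]
    for _ in range(max_iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            # Assign the point to the index of its nearest centroid.
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            buckets[nearest].append(p)
        # Recompute each centroid; an empty bucket keeps its old centroid.
        updated = [
            tuple(sum(axis) / len(b) for axis in zip(*b)) if b else centroids[i]
            for i, b in enumerate(buckets)
        ]
        if updated == centroids:  # stabilized: no centroid moved
            break
        centroids = updated
    return centroids, buckets


def distortion(centroids, buckets):
    """Sum of squared distances from every point to its bucket's centroid."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, c))
        for c, bucket in zip(centroids, buckets)
        for p in bucket
    )


# Hypothetical data: two small groups of points, far apart from each other.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

for k in (1, 2, 3):
    print(k, distortion(*k_means(points, k)))
```

On this data the distortion falls steeply from K = 1 to K = 2 and only slightly after that, which is the kind of signal the heuristic looks for. It remains a judgment call, not a proof that K = 2 is correct.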

One should be careful in applying unsupervised machine learning to a problem space due to the following challenges:

1.  As the data is unlabeled, one can never be sure about the accuracy or precision of the outcome.

2.  There is no definitive outcome. For instance, if you set K to 2, you get two clusters. If you set K to 5, you get five clusters. Every algorithm has its own share of pitfalls.

3.  It requires good domain knowledge (an expert) so that someone is able to look at the outcome and validate the results of unsupervised machine learning.

Despite the above challenges, unsupervised machine learning has proved to be pathbreaking in the field of data science due to its widespread application. It can be useful in many scenarios.

1.  Clustering: It is useful in grouping the data into different clusters. For instance, looking at the ITR profiles of people, one can cluster taxpayers into compliant and non-compliant taxpayers.

2.  Anomaly detection: It can identify outliers in the data. Any anomalous input can be spotted using this. For instance, any unusual hike in refund claims or exemption claims can be spotted. It could also be used to detect signals from an alien planet or intrusions in a network.

3.  Association mining: It can be used to identify sets of items that occur together in the dataset. For example, if people tend to buy bread with butter more frequently than bread with spices, a shopkeeper can decide whether to place bread next to butter or next to spices.

4.  Dimensionality reduction: If a data point contains thousands of attributes, it is difficult to analyze and study. Dimensionality reduction helps in reducing the dimensions of analysis. For instance, out of 1000 attributes/features, 10 important features could be drawn out and studied.
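
The bread-and-butter example from the association mining item can be sketched as a simple count of how often each pair of items appears in the same basket. This is only the first counting step that association mining builds on, not a full mining algorithm, and the basket data and the function name pair_support are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Count each unordered pair at most once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


# Hypothetical purchase data, invented for illustration only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "spices"],
    ["bread", "butter", "spices"],
]

support = pair_support(baskets)
for pair, share in sorted(support.items(), key=lambda item: -item[1]):
    print(pair, share)
```

In this made-up data, bread and butter share a basket more often than bread and spices, so the shopkeeper would place bread next to butter.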

Some of the common algorithms/ models for achieving these objectives are listed below.

Clustering: A common approach for clustering is K-means. The five-step process explained above is an example of the K-means clustering algorithm where the value of K is 2. Other prominent approaches are DBSCAN and OPTICS.

Anomaly detection: Isolation forest and local outlier factor are commonly used for discovering anomalous patterns.

Association mining: The Apriori and FP-growth algorithms are prominently used in mining patterns from unlabeled data.

Dimensionality reduction: Principal component analysis and Singular value decomposition are the most commonly used dimensionality reduction approaches.

In recent years, unsupervised learning has been used extensively in neural networks. This has revolutionized the research and application of Machine Learning and Artificial Intelligence. However, that requires a dedicated article of its own.

Concept of 1-2-3-4-5-6

Recently I was conversing with a serial entrepreneur friend. I asked him about the key motivation behind living such a zealous life. He explained that the key theme behind entrepreneurship in life is "Never settle". But it is easier said than done, because most people fall into the trap of 1-2-3-4-5-6. For most people, life is all about one wife, two kids, a three-BHK flat, a four-wheel vehicle, a 5-figure EMI, and a 6-figure monthly salary. He did not wish to get trapped in this pattern, hence he chose to be an entrepreneur. From what I understand of his nature, entrepreneurship comes from a desire for achievement and recognition and a drive to excel. Overall, it was a very interesting conversation. I discussed many ideas with him. Let's see what is in store for us.


Sales tips

Recently I was talking to my nephew, who has made a successful career in sales. I asked him for five tips for sales. Here is his wisdom, which I consider worth listing:

1. Personal relationships: More than business, relationships matter. Personal relationships bring business.

2. Quality of service is important for creating relationships.

3. Follow-up: Persistent follow-up with customers despite denials, combined with a persuasive approach, helps in closing the deal.

4. Salesmanship: Establishing a connection is a necessary first step for the successful delivery of content. Whenever you are pitching, always identify the key decision-maker, make a rational argument, and be respectful to the client come what may, but drive your presentation. Don't get into Q&A mode. Know your presentation.

5. Customer is god: It is never the mistake of the customer. It is always the salesman's mistake for not being able to close the deal.

These tips are pertinent to almost every walk of life. One thing I learned from watching him is that it is not aptitude but attitude that decides your altitude.


Monday, November 30, 2020

Book review: Ethical Algorithm

This book provides a layman's perspective on algorithmic decision making and social values like privacy, fairness, transparency and interpretability, morality, and safety. It is a highly recommended read for anyone who wishes to understand the pitfalls of Artificial Intelligence and algorithmic decision making. The best part is that the authors have explained complex mathematical and probability concepts in layman's language, so that even a social science reader can understand them well.

Each chapter is dedicated to a specific theme. For instance, the first chapter, on privacy, discusses the pitfalls of the privacy conversation. It explains how solutions like anonymization are incomplete. It gives examples such as Arvind Narayanan's Netflix research and the legal pitfalls of movie database records. It is interesting to know that, despite anonymization, a person can be zeroed in on using just 6 geolocations from an entire day. It discusses the tension between privacy and predictive accuracy and explains how differential privacy can act as a regulating knob between the two.

Similarly, the chapter on fairness is filled with numerous examples. The book is engaging in its analysis and explanations, and it tries to bring out both sides of the arguments. With respect to fairness, the book beautifully explains the challenge of defining what fairness is and synthesizes the entire discussion in the form of the Pareto curve and Pareto frontier. The definition of fairness as equality of false positives or equality of false negatives is best explained using this. Here the authors also highlight that science can lay out the trade-offs between different definitions of fairness, but the ultimate decision hinges on human wisdom.

One chapter is dedicated to algorithmic game theory, where the authors explain issues like the prisoner's dilemma. It explains how technological solutions like Waze and Google Maps promote self-interest over social welfare or competitive equilibrium. It also highlights other issues, like echo chambers in machine learning and recommendation engines. Suggestions like cooperation through correlation are given as examples.

The book also provides insight into email scams and how adaptability and scale are used to fool people about market prediction. The concepts of p-hacking and the garden of forking paths highlight the issue of false scientific research in the community.

Overall, the book is an interesting read with umpteen examples, simple and lucid language, and open-ended perspectives. A must-read for folks who wish to know about the various pitfalls of algorithmic processing and AI/ML.




Paradoxical life

 If I have ambition, how can I be relaxed?

If I am relaxed, how can I fulfill my ambitions?

If I am cool, how can I be disciplined?

If I am disciplined, how can I be cool?

How can I be passionate without being focused?

How can I be focused without being passionate?

How can I be consistent without being goal-oriented?

How can I be goal-oriented without being consistent?


Courage to continue

UPSC is touted as one of the toughest exams in the world. Nearly 10 lakh people appear for the preliminary exam and hardly 15,000 make it to Mains. Of these, some 2,500 appear for the interview, and finally about 1,000 become civil servants, of whom 100 are IAS. Most of the rest rewrite the exam. So effectively, out of 10 lakh, nearly 9,99,900 either give up or rewrite the exam. Failure in UPSC is therefore nothing but a routine thing. It should not depress anyone. One should not feel any less human, less talented, or in any way inferior.

Success is not final. Failure is not fatal. It is the courage to continue that counts. Whether you think you can or think you cannot, you are right. The harder you work, the luckier you get.

We have enough examples around us: Sridhar, Abhishek, Nitesh, Ravi, Amit Gemawat, and umpteen others. Many people flunked prelims 3 times and became IPS in the 4th attempt. And there are people who appeared for the interview 4 times and could not become IAS, and people who appeared 3 times but could not clear the exam. So the moral of the story: the result is not in your hands. Then what is in my hands? Effort. Persistent effort. A determined attitude to give my best. Make every minute, every second, count.

The question is: why this exam? No other exam makes you as impactful. Money, fun, contribution to nation-building: every job offers all this. What makes the Civil Services special is that you can actually do some productive work. If a lady comes to your office, one direction from you can get her justice. One surprise visit to the health center can bring an immediate improvement in services for thousands of people. At such a tender age, you can lead an entire district with a population of 10-20 lakh. Your every walk and talk can become a source of inspiration for hundreds and thousands of others.

If you have this conviction, then no failure can stop you. Rarely do even 20 people become IAS on the first attempt. All the rest are in the same boat as you, so don't worry. Take it as an opportunity to get a head start on next year's examination.

I designed this survey for students to trigger introspection:


Civil Service Exam is a test of (Choose one which you think is most important)
What is the most desirable quality in an aspirant? (Choose one which is most important in your view)
What is the foremost quality of a Civil Servant? (Choose one which is most prominent. we know it is a mix)
Why did I falter this year?
After prelims failure
For the next 7 months
Will you be writing mains at some point?
How prepared am I for the Mains subject?
How prepared am I for Mains GS?
How many days have I wasted doing nothing after the prelims result?
What are some crucial traits helpful in winning a competitive race?
What do I need to do to clear next year's exam?
How to make the best out of the remaining time? (Mark only those which apply to your plan)
When I look at Laxmikant and Spectrum, I feel like
Who is an ideal study partner?
Study partner (choose most appropriate)
How frequently do I check social media? (Choose the closest)
Social Media (Insta, Facebook, WhatsApp, TG)
Best way to control social media
The biggest obstacle in my preparation
Why am I preparing for civil services? (Choose one which is most prominent. Do not go by the social desirability factor.)
Any comments