Takeaways from Nepal (1)

I'm back.

I sit at the foot of my bed and sigh. I've been wearing the same hiking pants for the last month. My mind naturally wonders how much of my sweat they absorbed over the 200-mile trek... I gladly push the grimy thoughts out of my head. Instead I focus on the present. My ass is pressed against a soft mattress. Something I haven't felt in some time. I focus my senses.

Then it all comes back. 

The fear from when I was dropped off a local bus in Dumre. The sting from a bloody blister. The mountainous rampart playing striptease with the fog after a rainstorm. The familiar face of another trekker after a period of solitude. The sunrise over Annapurna-2.

After everything that happened, am I still the same person? How much do you need to change to become "new"? I remember taking a course my freshman year, Philosophy 3 - nature of the mind, that extensively covered this topic. Damn, I should've paid more attention back then - it's a phrase I say more frequently now than ever.

I spent a great deal of time on the trek pondering different topics. I try to summarize some of my learnings here.

Problems/Fears (and onions)

At the start of the journey (specifically after I was dropped off at Terminal G in SFO), I felt my knees buckle under the weight of the formidable unknown. I envisioned a physical black hole resting on my shoulder, sucking away at my spirit. I was scared for obvious reasons. I didn't speak the language. I didn't know how I would navigate. I had never even backpacked in the States for more than a day, yet I had to trek for 3 weeks abroad. I didn't know how I was going to react to the altitude. My doctor told me you'll feel the altitude at over 10,000 ft. I gulped at the fact that I had a 17,700 ft titan to conquer. Did I bring enough antibiotics? What about Diamox (for the altitude)? Chlorine dioxide water treatment? Is my pack too heavy? Will my startup's website stay up? Are there bears? What about the Yeti?

Soon my brain is paralyzed. The neurons are fired up like they're trying to achieve nuclear fusion, and there's no bandwidth left for me to form a rational thought. I sit in silence, exhaling harder with each breath.

Just then a lady announces the airport security protocol over the loudspeaker. Something about reporting unattended baggage.

That's right. I'm currently just sitting at the terminal. None of my fears/problems are happening now. I realize that though most of my fears are valid (I admit that a few are blatantly irrational), the majority of them will not occur at any single moment. After all, I can only experience life in a single slice of time. This means that at any given moment, there may be problem(s) for me to tackle, but they'll always be a subset of the collection of total problems (and probabilistically a small subset). Yet I always focus on and feel pressured by the collective unknown; the black hole. But is it a black hole? I know what problems it might consist of. I have an idea of when the problems might occur, and how much they'll trouble me. So in reality, it's more like a discrete timeline of layers of problems. Suddenly the black hole is looking like an onion.

Let's put it this way. It's unlikely that I'll have to swallow a giant raw onion. Rather, I'll peel each layer (and cry in the process, get it?), and tackle the individual problems. Now I can't instantaneously defeat the collective problems, i.e. the whole onion, but I don't have to; we can only experience moments in slices, since our sensory input is bound by the flow of time, which is linear. This means we can't live in multiple moments at once. So look up from your screen, and describe to yourself what you see. This is the current moment you're experiencing (and a discrete moment can last anywhere from a few milliseconds to a whole millennium). This current moment and each subsequent moment will contain a set of problems; but you'll only ever have to deal with a single moment's worth of problems.

Equipped with this powerful thought, I start thinking about my immediate problems. Well, frankly, they've started boarding the plane. It would be problematic if I missed it. My immediate focus is to get my ass to the counter, hand over my ticket to be scanned, and board the plane. It's done, and it only took me 8 minutes. The next moment I'm seated in the plane. I start wondering how I'll get a taxi in a foreign country late at night (since my hotel transportation never responded). What about the $1,500 USD cash that the lack of reliable ATMs in Nepal is forcing me to bring? If I get lost, will the map on my phone work? I won't have a valid SIM card; where will I get a SIM card? Then I stop my train of thought; I'm crossing the moment boundary. The only thing I should (and need to) focus on is how I'm going to spend the next 16 hours with my in-flight entertainment headphone jack malfunctioning. Sleep, I thought. And I did.

I'll talk about some of my other thoughts/discoveries in the next post.


Poking Around Facebook Data

Shit, it's already 2015.

Since I had to graduate from college in 2014 (though I'm happy about no longer having to pay tuition), a lot has changed. Many of my friends have ventured down the industry route by joining tech companies. As a result, a large portion of my buddies moved out of the college town into tech centers such as San Francisco and Palo Alto.

Combining that fact with my startup work hours, I can tell you that my social life hit a lifetime low. I was so busy with work that I didn't even realize my lease was ending. As I scrambled to look for a new apartment, I called up an old friend to crash on his couch while I was homeless. I can honestly say that the 'startup homelessness' is overly romanticized and that the overall experience was miserable.

"In June 2010, I moved out of my apartment and I have been mostly homeless ever since" - Brian Chesky, CEO Co-Founder Airbnb


I've known my friend (let's call him Andre) since the start of college, and roomed with him for two years. But as I was about to make that call, I realized that I hadn't spoken to him in quite a while. The haunting realization that my relationships were dwindling came as I tallied up the number of friends I still kept in touch with (other than my co-founders).

Fast-forward a few weeks. I'm reading up on activity trackers (think Fitbit and Jawbone) and what they can do with my data. I learn that they can figure out when I'm sleeping (or if I'm sleeping less), with just the sensor data from a wristwatch. As a Statistics and Computer Science grad, the creative usage of data made me ecstatic. I've known that my data was used in places like advertisements, but this is the first time that data was directly used to improve my life. Then I thought, how come all of these giant corporations are using my data for me, instead of me using it for myself?


I spent some time thinking about what data I can use.

So I thought, why not my Facebook data? I'm not a big Twitter user, and I haven't quite caught up with Instagram or Pinterest. Facebook would encompass nearly all of my online social media data. A quick Google search revealed that you can in fact download your entire Facebook presence in one archive. (free tutorial on how here: Tutorial for the Analysis)

I went on Facebook and asked for my data dump.

They were careful with my data and asked me to authenticate one additional time (this in turn made me wonder if anyone else had access to this data). After a few minutes of waiting, I received a dump with a (unzipped) size of 49,500,160 bytes (47MB or about ten mp3 songs). That's my entire Internet presence packed up into almost 400 million zeroes and ones. Unsurprisingly, close to 32MB out of the 47MB are media (photos/videos) and 15MB are other user-generated data. Digging deeper, I found that of the 15MB, over 14 MB consisted of my messaging data.
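The size figures above are easy to double check. A minimal sketch, assuming "MB" here means binary megabytes (1024 * 1024 bytes):

```python
# Archive size quoted in the post, in bytes.
size_bytes = 49_500_160

# Assumption: "MB" is read as binary megabytes (MiB).
size_mib = size_bytes / 1024**2

# Eight bits ("zeroes and ones") per byte.
size_bits = size_bytes * 8

print(f"{size_mib:.1f} MiB, {size_bits:,} bits")
```

That works out to roughly 47.2 MiB and about 396 million bits, which matches the "47MB" and "almost 400 million" figures.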

100,000.

That's how many back-and-forth messages I had over the last 5 years. That's already 50+/day! But was it 50+/day consistently? Or did I send hundreds of messages one day and remain relatively quiet the next? Moreover, to whom was I talking? I know I wasn't too active on Facebook early on (I was one of the last waves of MySpace users #tomismyfriend), and I haven't been too active more recently.
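The daily average is simple arithmetic. A quick sketch, assuming "5 years" means 5 * 365 days and taking the 100,000 count as a round number:

```python
total_messages = 100_000   # round figure from the post
days = 5 * 365             # assumption: five 365-day years

per_day = total_messages / days
print(f"{per_day:.1f} messages per day on average")
```

An average like this says nothing about how evenly the messages were spread out, which is exactly the question raised above.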

To dig deeper, I opened up the message file and saw that the data was encoded in HTML. This makes it easier to explore via my browser, but slightly harder to systematically analyze. Upon inspecting the HTML dump that encoded my messages, I decided to parse the HTML via BeautifulSoup.
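Here is a rough sketch of what that parsing could look like. The HTML snippet, the tag layout, and the class names ("thread", "message", "user", "meta") are assumptions made for illustration, and parse_messages is a hypothetical helper; the actual export may be structured differently, so the selectors would need adjusting after inspecting the real file.

```python
from bs4 import BeautifulSoup

# Made-up sample; the real export's tags and class names may differ.
SAMPLE_HTML = """
<div class="thread">Me, Friend A
  <div class="message">
    <span class="user">Me</span>
    <span class="meta">Monday, January 5, 2015 at 9:15pm</span>
  </div>
  <p>Are you around this week?</p>
  <div class="message">
    <span class="user">Friend A</span>
    <span class="meta">Monday, January 5, 2015 at 9:17pm</span>
  </div>
  <p>Of course.</p>
</div>
"""


def parse_messages(html):
    """Return a list of (participants, sender, timestamp_text, body) tuples."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for thread in soup.find_all("div", class_="thread"):
        # Assumption: the thread's first text node lists the participants.
        header_text = str(thread.contents[0])
        participants = [name.strip() for name in header_text.split(",")]
        for header in thread.find_all("div", class_="message"):
            sender = header.find("span", class_="user").get_text(strip=True)
            when = header.find("span", class_="meta").get_text(strip=True)
            # Assumption: the message body is the <p> right after the header.
            body_tag = header.find_next_sibling("p")
            body = body_tag.get_text(strip=True) if body_tag else ""
            rows.append((participants, sender, when, body))
    return rows


for row in parse_messages(SAMPLE_HTML):
    print(row)
```

Returning flat tuples keeps the next step simple: they can be counted directly or loaded into a table.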

Top communicated friends.

Unsurprisingly, I recognized every single one of the top 5. But something was wrong; I haven't talked to some of these people in quite a long time (some in over a year). But what does this mean? Is this simply because of my overall decrease in Facebook use, or does this actually signal that my personal relationships are systematically deteriorating? Or perhaps these are just some exceptions?

To clarify the matter, I decided to build a time-series plot of the raw weekly message count for the top friends. To keep things consistent (since I also processed group messages), here are the rules I used:

  • I keep track of [to] and [from] counters for each person I interact with.
    • This means that for each person I've communicated with, I keep a unique counter that represents the number of messages I sent [to] that person, and also the number of messages I received [from] that same person.
  • For each person sending a message within a group, I increment the [from] count for that person.
  • For each message I send, I increment every single group member's [to] count.
  • Every message thread with more than 4 participants was ignored (certain group threads included everyone invited to an event, an entire class, etc.).
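The rules above can be sketched in a few lines of plain Python. The input shape (a list of participants plus a list of senders per thread), the "Me" label, and the choice to count the account owner among the participants are all assumptions made for the example, not details from the original analysis.

```python
from collections import defaultdict

ME = "Me"              # hypothetical label for the account owner
MAX_PARTICIPANTS = 4   # threads with more participants are ignored


def count_messages(threads):
    """Apply the counting rules to (participants, senders) pairs.

    participants: everyone in the thread (assumed to include ME).
    senders: one entry per message, naming who sent it.
    Returns ([to] counts, [from] counts) keyed by person.
    """
    to_count = defaultdict(int)
    from_count = defaultdict(int)
    for participants, senders in threads:
        if len(participants) > MAX_PARTICIPANTS:
            continue  # skip large group threads
        others = [p for p in participants if p != ME]
        for sender in senders:
            if sender == ME:
                # A message I send counts toward every other member's [to].
                for person in others:
                    to_count[person] += 1
            else:
                # A message from someone else counts toward their [from].
                from_count[sender] += 1
    return dict(to_count), dict(from_count)


example_threads = [
    (["Me", "A"], ["Me", "A", "Me"]),
    (["Me", "A", "B"], ["Me", "B"]),
    (["Me", "A", "B", "C", "D"], ["Me", "A"]),  # 5 participants, ignored
]
print(count_messages(example_threads))
```

One side effect of these rules is that a single message I send to a group is counted once per member, so group chats weigh more heavily in the [to] totals than one-on-one chats.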

Finally I used pandas to wrangle with the data, and matplotlib to plot it (names are removed below).
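A minimal sketch of the weekly roll-up, using made-up rows in place of the parsed messages. The column names ("timestamp", "friend") and the plot_weekly helper are hypothetical, and this is one reasonable way to do it in pandas rather than the exact code behind the chart.

```python
import pandas as pd

# Made-up rows; each row stands for one message exchanged with a friend.
messages = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2014-01-06", "2014-01-07", "2014-01-08", "2014-01-14", "2014-01-15"]
        ),
        "friend": ["A", "A", "B", "B", "B"],
    }
)

# Count messages per calendar week (weeks ending on Sunday) and per friend,
# then pivot so each friend becomes a column.
weekly = (
    messages.groupby([pd.Grouper(key="timestamp", freq="W"), "friend"])
    .size()
    .unstack(fill_value=0)
)
print(weekly)


def plot_weekly(weekly_counts):
    """Draw one line per friend; needs matplotlib installed."""
    import matplotlib.pyplot as plt

    ax = weekly_counts.plot(figsize=(10, 4))
    ax.set_xlabel("week")
    ax.set_ylabel("messages")
    plt.show()
```

Calling plot_weekly(weekly) would draw the time series, one line per friend.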

Yup, that's kind of hard to read.

The chart shows that my communication was dominated by one person, though that has faded significantly in more recent times.

To make it easier to read, I instead plot the top 4 friends (combined [to] and [from]). As an extra measure for clarity, I rank them each week (taking advantage of scipy.stats.rankdata()). For a last bit of added fancy-ness, I incorporate plot.ly to create and deploy an interactive graph into the cloud.
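The weekly ranking step might look something like this. The counts are invented, and both negating the counts (so the most active friend gets rank 1) and using method="min" for ties are choices made for this sketch; the original analysis may have ranked differently.

```python
import numpy as np
from scipy.stats import rankdata

friends = ["A", "B", "C", "D"]

# Made-up combined [to] + [from] counts: one row per week, one column per friend.
weekly_counts = np.array(
    [
        [12, 5, 0, 3],
        [4, 9, 9, 1],
    ]
)

# rankdata ranks in ascending order, so negate the counts to give the
# busiest friend rank 1. With method="min", tied friends share the better rank.
ranks = np.vstack([rankdata(-row, method="min") for row in weekly_counts])

for week, row in enumerate(ranks, start=1):
    print(f"week {week}:", dict(zip(friends, row.tolist())))
```

Plotting ranks instead of raw counts makes the lines comparable across busy and quiet weeks, at the cost of hiding how large the gaps between friends actually are.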


Some Observations

The rise and fall of Lisa and James.

That was the spring of 2013 leading into the summer of 2013. I had quit my job in January of that year and was feeling pretty burnt out. There are reports referring to losing and gaining friends; this seems like a good example. Though I was pretty aware of what happened, it is still punishing to see it laid out in front of me. You don't always get to see something so intangible in such a quantified manner.

Matt remaining consistent.

Matt is someone I've known for a long time. We're no longer the closest friends, but we talk frequently and consistently (don't know why, but we do). I became close to Matt in late high school, and we've stayed friends ever since. What surprised me was my own lack of awareness about the fact that Matt has indeed been a consistently good friend. We're up to date on what's happening in our lives; I'm probably going to give him a call now, maybe let him know how our Demo Day went.

Emergence of Harris.

I met Harris via a common interest near the end of college. We hang out quite a bit now, so this is accurate. He left the country 2 days ago (as of writing this sentence) to pursue his startup abroad. This discovery will serve as a good reminder for me to keep in touch with him.

Takeaways

Data is insanely powerful.

But I already knew that. Or did I? As an individual experiencing life one data point at a time, I don't get to see the seemingly insignificant changes that add up to a life-changing trend. I thought I was aware of my own personal data (weight over time, GPA over time, mile-run-time over time, bench-max over time, bank balance over time, you name it). Turns out I was missing the most important one: my relationships over time. The hard thing about relationships is that I didn't know of any quantifiable way to analyze them, until now. It's no coincidence that a simple analysis performed on a single dataset revealed meaningful insights.

Limits of data.

What? Limits? Yes limits. A quick poke at the data revealed a lot about my social history, but it's still limited to Facebook. This doesn't include my texts/phone calls or real life interactions. This explains why my co-founders (and long time friends) aren't on the top list, despite the fact that I spend over 80% of my awake and functional time with those guys.

Journal?

I now regret that I haven't kept a journal to analyze. If we can figure out what is happening (and might happen) in the stock market via sentiment analysis, perhaps we can leverage journal entries to figure out the complex thing that is human relationships. Maybe it'll predict when I'll be depressed / stressed (I'm willing to bet the word "fundraising" is going to be closely tied with my stress level), and help me be more aware of my personal well-being.

Last Thoughts

All I did was build a plot that ranked my messaging behavior. Yet it revealed quite a lot about my own social behavior. With more data, I can see ways of building models that can reliably predict ups and downs in my personal life. Given how caught up we are with our lives and careers, maybe we all need something that reminds us to keep in touch with our human friends. Perhaps that'll be my next startup idea.

Tools Used

  • BeautifulSoup is an easy-to-use Python package that helps you parse HTML.
  • Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
  • Matplotlib is a Python 2D plotting library which produces publication-quality figures.
  • SciPy is an open source library of scientific tools.
  • Plot.ly is an online analytics and data visualization tool.