<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Python on Ray Yang, Ph.D.</title>
    <link>https://yangphd.com/tags/python/</link>
    <description>Recent content in Python on Ray Yang, Ph.D.</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 15 Apr 2020 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="https://yangphd.com/tags/python/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Mapping Twitter Topics and Sentiment about FAAMG Companies since Covid-19 Outbreak</title>
      <link>https://yangphd.com/research/mapping-covid-19-related-latent-topics-and-sentiment-on-twitter-about-faamg-the-top-tech-companies/</link>
      <pubDate>Wed, 15 Apr 2020 00:00:00 +0000</pubDate>
      
      <guid>https://yangphd.com/research/mapping-covid-19-related-latent-topics-and-sentiment-on-twitter-about-faamg-the-top-tech-companies/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#motivation&#34;&gt;1 Motivation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#procedures&#34;&gt;2 Procedures&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#data-source&#34;&gt;2.1 Data Source&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#topic-modeling&#34;&gt;2.2 Topic Modeling&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#results&#34;&gt;3 Results&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#perceptual-map-of-faamg-based-on-topic-sentiment&#34;&gt;3.1 Perceptual Map of FAAMG Based on Topic Sentiment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#keywords-co-occurrence-network-on-each-topic&#34;&gt;3.2 Keywords Co-occurrence Network on Each Topic&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;motivation&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;1 Motivation&lt;/h1&gt;
&lt;p&gt;On a whim, I mapped out what people say (the text) and feel (the sentiment) about the top tech companies (FAAMG) on twitter since the Covid-19 outbreak. The keywords and sentiment that are associated with the main topics should differ across these companies.&lt;/p&gt;
&lt;p&gt;By no means I am drawing any causal association here. But the story starts with checking the recent price movements of the FAAMG stocks. Obviously, Covid-19 has impacted these companies quite differently.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/stock_price_FAAMG.png&#34; alt=&#34;FAAMG stock price&#34;&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;procedures&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;2 Procedures&lt;/h1&gt;
&lt;div id=&#34;data-source&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;2.1 Data Source&lt;/h2&gt;
&lt;p&gt;For each company, I downloaded 1,000 tweets containing the company name as the hashtag (e.g. “#amazon” for Amazon).
I used Python module “tweepy” to stream the data and set the “since” parameter to ‘2020-03-01’, which collects a sample of 1,000 tweets since March 1st.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;topic-modeling&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;2.2 Topic Modeling&lt;/h2&gt;
&lt;p&gt;Using Python module “gensim”, I detected three most significant topics (including Covid-19, business, and daily life), which are named by the associated keywords (most relevant terms).&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/lda.png&#34; alt=&#34;latent topics and revevant terms&#34;&gt;&lt;/p&gt;
&lt;p&gt;Using Python module “vaderSentiment”, I computed sentiment scores for each tweet and aggregated the scores by company. The topic-based sentiment scores are used to construct the perceptual map, in which each company takes a different position.
Using Python module “nltk”, I find the most frequent topic-based co-occurring words for each company.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;results&#34; class=&#34;section level1&#34;&gt;
&lt;h1&gt;3 Results&lt;/h1&gt;
&lt;div id=&#34;perceptual-map-of-faamg-based-on-topic-sentiment&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;3.1 Perceptual Map of FAAMG Based on Topic Sentiment&lt;/h2&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_sentiment_FAAMG.png&#34; alt=&#34;perceptual map and the positioning of each company&#34;&gt;&lt;/p&gt;
&lt;p&gt;It looks Amazon does well on all topics, especially on Covid-19-related topic. Microsoft and Apple do well on “Business”.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;keywords-co-occurrence-network-on-each-topic&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;3.2 Keywords Co-occurrence Network on Each Topic&lt;/h2&gt;
&lt;p&gt;Here are the work keywords co-occurrence patterns for each topic of these companies.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_keyword_amazon.png&#34; alt=&#34;Amazon Word-Association Network&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_keyword_apple.png&#34; alt=&#34;Apple Word-Association Network&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_keyword_facebook.png&#34; alt=&#34;Facebook Word-Association Network&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_keyword_google.png&#34; alt=&#34;Google Word-Association Network&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://yangphd.com/opinion_mining/041520_FAAMG_twitter_topic_sentiment/topic_keyword_microsoft.png&#34; alt=&#34;Microsoft Word-Association Network&#34;&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Interoperability in Tech. Evolution—Thoughts on “Reticulate&#34;</title>
      <link>https://yangphd.com/post/2019/03/11/interoperability-in-technology-evolution-thoughts-on-reticulate/</link>
      <pubDate>Mon, 11 Mar 2019 00:00:00 +0000</pubDate>
      
      <guid>https://yangphd.com/post/2019/03/11/interoperability-in-technology-evolution-thoughts-on-reticulate/</guid>
      <description>


&lt;p&gt;&lt;i&gt;“In our highly dynamic world, it’s not enough for an organization to possess a competitive advantage at a point in time; it needs an evolutionary advantage over time—a capacity to change as fast as change itself; to change before a crisis breaks.” — Gary Hamel&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;On March 6th, the R package &lt;a href=&#34;https://CRAN.R-project.org/package=reticulate&#34; target=&#34;_blank&#34;&gt;“reticulate”(1.11.1)&lt;/a&gt; was released on Cran, which allows R users to directly call Python objects or use Python in an R interface.&lt;a href=&#34;#fn1&#34; class=&#34;footnote-ref&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt; Bloggers have been praising this development for the ground-breaking interoperability between R and Python. With “reticulate,” we no longer need to jump between different IDEs to leverage the complementarity between R and Python. This will certainly bring big ease to the programming routine of many “bilingual” data scientist using both R and Python.&lt;/p&gt;
&lt;p&gt;As an occupational habit, I tend to think about the value creation and value capture in the dynamics of industry and technology evolution. Here are some thoughts on the “reticulate” case.&lt;/p&gt;
&lt;p&gt;First of all, one the user level, “reticulate” by Rstudio will benefit those who program in both R and Python. So far, R and Python are the two most popular programming languages in data science. A decision for a newcomer in data science to make is always: “R vs. Python, which one should I choose?” Now this headache is significantly lessened. If we can overcome the “language barrier” easily, we won’t be stuck in the either-or any more.&lt;/p&gt;
&lt;p&gt;Second, on the community level, there have already been quite a few early adopters who are R-Python bilinguals even before the pre-release of “reticulate.” They choose a language depending on which one is more readily usable for a specific problem. For bilinguals, there is no transition in their head, but there are some inefficient transitions on their fingertips. Now, they will leverage the power of “interoperability” to keep their hand movements in one place–Rstudio. These users will be engaged in Rstudio more often. What will also naturally happen? (a) the total number of R-Python bilinguals will grow; (b) the two languages will become more integrated, instead of being divided. We will see fewer (or slower growth of) packages/modules which have the same name and perform the same functionality in two languages, and also less “R vs Python comparisons.”&lt;/p&gt;
&lt;p&gt;Third, thinking from the technological (and also strategic) perspective, we see the greatly increased compatibility between two programming languages. What follows compatibility will be an increase in the adoption rate on the technology who opens up its door to embrace other technologies. Rstudio is the superpower in the R domain. Such a move will draw many Python programmers to R and also attract and retain R users with the empowerment of R-Python interoperability. This “one-way” compatibility&lt;a href=&#34;#fn2&#34; class=&#34;footnote-ref&#34; id=&#34;fnref2&#34;&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; will greatly enhance the usability and desirability of the Rstudio as a platform provider.&lt;/p&gt;
&lt;p&gt;The release of “reticulate” is a strategic move of Rstudio. More data scientists will be stay engaged in Rstudio, or they will spend more fo their time on the platform. But this is not the end of the story. The interoperability will continuously enlarge the overlap between the R and Python communities if we draw a Venn diagram in our mind. The Python-centered IEDs are also powerful. They have great features that their users wouldn’t live without. They may match the move of Rstudio. More likely, the reverse “recuticate” in Python will emerge from the large open source community, by innumerable open source developers. Even if the one-way compatibility will continue, Python IEDs will not lose the game. As R programmers can easily operate Python in Rstudio, they are also more likely to travel and contribute to the Python world. This is similar to the case when Apple allows Microsoft’s Windows operating system to be installed on Mac and Amazon’s Kindle electronic book app installed on iPad. The competing platforms will both benefit from the one-way compatibility, because the enlarged overlap of the Venn diagram will bring “two-way” flows of traffic to both sides.&lt;a href=&#34;#fn3&#34; class=&#34;footnote-ref&#34; id=&#34;fnref3&#34;&gt;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Open source can always provide more flexibility to the users, developers, and businesses involved. Perhaps we cannot formulate the dynamics of technological evolution in a game-theoretic fashion, as the players in open source are not “competing” in the same way. For example, Anaconda, the company who distributes many Python IDEs (including JupyterLab and Spyder), has long been providing Rstudio IDE on their cloud and GUI. I see “reticulate” as a strategic, but also very harmonious move to create net benefits to the open source community for data science at large.&lt;/p&gt;
&lt;div class=&#34;footnotes footnotes-end-of-document&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn1&#34;&gt;&lt;p&gt;This can be easily done in Rstudio. &lt;a href=&#34;https://resources.rstudio.com/webinars/r-rstudio-1-2-amp-python-a-love-story-sean-lopp&#34; target=&#34;_blank&#34;&gt;Webinar by Rstudio&lt;a/&gt; &lt;a href=&#34;https://rstudio.github.io/reticulate/&#34; target=&#34;_blank&#34;&gt;Documentation by Rstudio&lt;a/&gt;&lt;a href=&#34;#fnref1&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn2&#34;&gt;&lt;p&gt;The existing python IDEs have not provided an interface for R-Python interoperability yet. Python programmers tend to use JupyterLab, Rodeo, Spyder, Visual Studio Code, and PyCharm, none of which have such an interoperability feature right now.&lt;a href=&#34;#fnref2&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id=&#34;fn3&#34;&gt;&lt;p&gt;For a more detailed analysis: &lt;a href=&#34;https://scholar.google.com/scholar?hl=en&amp;as_sdt=0%2C5&amp;q=Frenemies+in+Platform+Markets%3A+The+Case+of+Apple’s+iPad+vs.+Amazon’s+Kindle&#34; target=&#34;_blank&#34;&gt;Adner, R., Chen, J., &amp;amp; Zhu, F. (2016). Frenemies in Platform Markets: The Case of Apple’s iPad vs. Amazon’s Kindle. Harvard Business School Technology &amp;amp; Operations Mgt. Unit Working Paper, (15-087).&lt;/a&gt;&lt;a href=&#34;#fnref3&#34; class=&#34;footnote-back&#34;&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>