Which three authors have the highest total degree

Assignment Help Management Information Sys
Reference no: EM131768860

Task steps:

1. Create an author-to-author tweet edge file from the original data set, stocktwit_graph_input.csv.

We just need two columns to create a graph: the source (Vertex 1) and the target (Vertex 2) of each edge. Select all rows (tweets) for columns K ("from_person") and M ("to_person"), or J and L for numerical author IDs, and save the result as "stocktwit_from_to" or another name you prefer.
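
The column selection above can also be scripted instead of done by hand in Excel. The following is a minimal Python sketch, not part of the original task. The column names "from_person" and "to_person" come from the task text; the function names, the "Source"/"Target" output headers, the decision to skip rows with an empty target, and the three-row sample are all assumptions made for illustration.

```python
import csv
import io

def build_edge_rows(tweet_csv, source_col="from_person", target_col="to_person"):
    """Return one (source, target) pair per tweet row that has both fields."""
    reader = csv.DictReader(tweet_csv)
    edges = []
    for row in reader:
        source = (row.get(source_col) or "").strip()
        target = (row.get(target_col) or "").strip()
        if source and target:  # assumption: rows without an addressee are skipped
            edges.append((source, target))
    return edges

def write_edge_file(edges, out):
    """Write the edges as a two-column CSV with Source and Target headers."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)

# Made-up sample standing in for stocktwit_graph_input.csv
sample = io.StringIO(
    "from_person,to_person,suggested\n"
    "alice,bob,1\n"
    "bob,carol,0\n"
    "carol,,0\n"
)
edges = build_edge_rows(sample)
out = io.StringIO()
write_edge_file(edges, out)
print(out.getvalue())
```

For the real data set, `sample` would be replaced by an open file handle on stocktwit_graph_input.csv and `out` by a handle on the edge file to be imported into Gephi.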

2. Use Gephi to generate and save author (node) metrics. Select the metrics you would like to explore and use for building models later. Include at least 5 different metrics.

a. Which three authors have the highest betweenness centrality?

b. Which three authors have the highest total degree?

c. Which three authors have the highest closeness?
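
As a cross-check on the degree values Gephi reports, total degree can be recomputed directly from the edge list: each edge adds one to the out-degree of its source and one to the in-degree of its target. The sketch below is in Python and covers total degree only. The `merge_parallel` flag is an assumption that makes explicit whether repeated source/target pairs count once or several times; which setting matches Gephi depends on how the edges were imported. The sample edges are made up.

```python
from collections import Counter

def total_degree(edges, merge_parallel=True):
    """Total degree (in-degree + out-degree) per author from (source, target) pairs."""
    if merge_parallel:
        edges = set(edges)  # count each distinct source/target pair once
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # out-degree contribution
        degree[target] += 1  # in-degree contribution
    return degree

def top_authors(degree, n=3):
    """Top n authors by degree; ties are broken alphabetically for determinism."""
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:n]

# Made-up edges; the last pair repeats the first
sample_edges = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("dave", "carol"), ("alice", "bob"),
]
top3 = top_authors(total_degree(sample_edges))
print(top3)
```

If the ranking differs from Gephi's, the first thing to check is whether the graph was imported as directed and whether duplicate edges were merged.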

3. Build the Node Table for Prediction

(1). Open the stocktwit_node.csv file in Excel, and create a new variable: Expert (i.e. "suggested"). This is the target variable we aim to classify or predict.

(2). Do not close the stocktwit_node.csv file. Open the stocktwit_graph_input.csv file, and then switch back to stocktwit_node.csv.

(3). Note that the unit in the stocktwit_node.csv file is a node (i.e. each individual author) and the unit in the stocktwit_graph_input.csv file is a tweet (i.e. each message). So, in order to transfer the value of expert from the stocktwit_graph_input table to the stocktwit_node table, we need to transform the data.

For Expert, we need to assign a single value to each author (i.e. whether they are an expert or not: 1 stands for yes; 0 stands for no).

Use the VLOOKUP function to copy the value of "suggested" from the stocktwit_graph_input table into the "Expert" column of the stocktwit_node table. The function for the first row should look like this:

= VLOOKUP(A2, stocktwit_graph_input.csv!$K$1:$AB$38200,18,FALSE),

where "A2" is the node name; "stocktwit_graph_input.csv!$K$1:$AB$38200" is the table range to search; 18 is the position, within that range, of the column whose value is returned (counting from K, the 18th column is AB); and "FALSE" requests an exact match.

(4). Save the stocktwit_node.csv file. You can delete the rows that have a missing value in Expert: these nodes appear only in the "to_person" column, so they have no tweets of their own.

Use the filter function in Excel to remove the #N/A rows.
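
The tweet-to-author transformation in step 3 amounts to a first-match lookup followed by dropping the authors for whom nothing is found. An exact-match VLOOKUP returns the value from the first matching row, which the Python sketch below imitates with `dict.setdefault`. The function names and the sample rows are hypothetical; only the column names "from_person" and "suggested" come from the task.

```python
def first_match_lookup(tweets, key="from_person", value="suggested"):
    """Map each author to the value in the first tweet row where they appear."""
    lookup = {}
    for row in tweets:
        lookup.setdefault(row[key], row[value])  # later rows do not overwrite
    return lookup

def label_nodes(node_ids, lookup):
    """Attach the Expert label to each node and drop nodes with no match."""
    return [(node, lookup[node]) for node in node_ids if node in lookup]

# Made-up tweet rows and node list
tweets = [
    {"from_person": "alice", "to_person": "bob", "suggested": 1},
    {"from_person": "alice", "to_person": "carol", "suggested": 1},
    {"from_person": "bob", "to_person": "carol", "suggested": 0},
]
nodes = ["alice", "bob", "carol"]
labelled = label_nodes(nodes, first_match_lookup(tweets))
print(labelled)
```

Here "carol" never appears as a sender, so she has no label and is removed, which mirrors deleting the rows where the lookup fails.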

4. In R, build and evaluate a classification model that uses the metrics in stocktwit_node_yourname.csv from step 2 as features to classify each author as an "expert" stocktwit author ("suggested" = 1) or not ("suggested" = 0). "suggested" is the target label variable.

(1). Using a seed of 100, randomly select 60% of the rows into training (e.g. called traindata). Divide the other 40% of the rows evenly into two holdout test/validation sets (e.g., called testdata1 and testdata2).
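
The task asks for this split in R; the Python sketch below only illustrates the logic of a seeded 60/20/20 partition. The function name is hypothetical, and the 50 integer "rows" are made up. Note that a seed of 100 in Python does not select the same rows as a seed of 100 in R, because the two use different random number streams.

```python
import random

def split_60_20_20(rows, seed=100):
    """Shuffle with a fixed seed, then cut into 60% / 20% / 20% partitions."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10          # 60% for training
    n_test1 = (len(shuffled) - n_train) // 2   # half of the remainder
    train = shuffled[:n_train]
    test1 = shuffled[n_train:n_train + n_test1]
    test2 = shuffled[n_train + n_test1:]       # gets the extra row if odd
    return train, test1, test2

rows = list(range(50))  # stand-in for the node table's row indices
traindata, testdata1, testdata2 = split_60_20_20(rows)
print(len(traindata), len(testdata1), len(testdata2))
```

Shuffling once and slicing guarantees that the three sets are disjoint and together cover every row.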

(2). Build the tree using the C5.0 function from the C50 package with default settings.

(3). Generate predictions (i.e. estimations) of the values of the target variable for the testing instances.

Generate a confusion matrix that shows the counts of true-positive, true-negative, false-positive and false-negative predictions for both testdata1 and testdata2. Treat 1 as the positive class.

Generate seven performance metrics: accuracy (the percentage of all testing instances that are classified correctly), plus precision (the percentage of instances predicted to belong to a class that actually belong to it), recall (also called the true positive rate) and F-measure (also called F-score) for each of the two classes of Expert.
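
The arithmetic behind the confusion matrix and the seven metrics can be sketched as follows. This is Python rather than the R the task requires, and it is only meant to show how the numbers relate. The function names, the made-up label vectors, and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN and FN with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

def safe_div(num, den):
    # Assumed convention: an undefined ratio is reported as 0.0
    return num / den if den else 0.0

def class_metrics(hits, false_alarms, misses):
    """Precision, recall and F-measure for one class."""
    precision = safe_div(hits, hits + false_alarms)
    recall = safe_div(hits, hits + misses)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    return precision, recall, f_measure

def seven_metrics(actual, predicted):
    c = confusion_counts(actual, predicted, positive=1)
    p1, r1, f1 = class_metrics(c["TP"], c["FP"], c["FN"])
    # For class 0 the roles swap: TN are its hits, FN its false alarms, FP its misses
    p0, r0, f0 = class_metrics(c["TN"], c["FN"], c["FP"])
    return {
        "accuracy": safe_div(c["TP"] + c["TN"], sum(c.values())),
        "precision_1": p1, "recall_1": r1, "f_1": f1,
        "precision_0": p0, "recall_0": r0, "f_0": f0,
    }

# Made-up labels: 3 experts and 5 non-experts
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 1]
counts = confusion_counts(actual, predicted)
metrics = seven_metrics(actual, predicted)
print(counts)
print(metrics)
```

Running the same two functions on the predictions for testdata1 and testdata2 separately gives one confusion matrix and one set of seven metrics per holdout set.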

(4). Would you recommend using the features from network analysis to identify experts in the Stocktwit community? Why or why not?

Attachment:- stocktwit_graph_input.rar

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd