The importance of ontologies for unstructured data warehouse

Reference no: EM132272035

Task 1: 130-150 words with reference

Discussion Topic:

Inmon (2011) identified that there are architectural, economical and technical considerations as part of data warehousing. Define two of each and provide examples.

Discussion Post:

Several considerations need to be studied when designing a data warehouse. They can be grouped as architectural, technical and economic.

Architecture: The business requirements should be reflected in the data warehouse design and architecture. Building a data warehouse on a flawed architecture is a very serious mistake because it is difficult and costly to rectify once implemented (Khan, 2003). A top-down approach is preferred when it is vital for an organization to analyze data from multiple departments. A bottom-up approach is favored when the priority is to serve the analytical needs of individual business functions. Moreover, a data warehouse should be scalable and flexible.

As usage increases over time and data volumes grow rapidly, the data warehouse should be able to scale adequately.

Technical: Data warehouse implementation should not be governed by technical considerations. Technology should be used as an enabler rather than viewed as the solution (Khan, 2003). At the same time, a data warehouse should be implemented using a well-established methodology.

Two of the most common technical considerations are the grain of the fact table and the update strategy for the data. The grain of the fact table is usually governed by the measurements that need to be stored in it (Kimball Group, 2016). Similarly, how frequently data is refreshed in the data warehouse depends on the operational requirements and the limitations of the underlying hardware resources.

Some elements of the data warehouse can be refreshed frequently, such as nightly or in real time, whereas other elements, such as aggregated and summarized data, can be refreshed weekly or monthly.
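
The relationship between grain and refresh frequency can be sketched in a few lines of Python. This is only an illustration under assumptions: the column names, the sample rows and the `summarize` helper are hypothetical and do not come from Khan or the Kimball Group.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest grain: one row per sale line item.
# Column names and values are illustrative assumptions, not from the source.
sales_facts = [
    {"date": "2019-03-01", "store": "S1", "product": "P1", "amount": 10.0},
    {"date": "2019-03-01", "store": "S1", "product": "P2", "amount": 5.0},
    {"date": "2019-03-01", "store": "S2", "product": "P1", "amount": 7.5},
    {"date": "2019-03-02", "store": "S1", "product": "P1", "amount": 12.0},
]


def summarize(facts, grain):
    """Roll line-item facts up to a coarser grain given as a tuple of keys."""
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[column] for column in grain)
        totals[key] += row["amount"]
    return dict(totals)


# Detailed facts might be loaded nightly; this summary might be rebuilt weekly.
daily_store_totals = summarize(sales_facts, grain=("date", "store"))
print(daily_store_totals)
```

A summary can always be derived from the finer grain, but the line items cannot be recovered from the summary, which is one reason the grain decision is made early.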

Economic: A data warehouse project should only be undertaken after it has been justified by a cost/benefit analysis. A well-justified project improves the probability of success and of support from business end users (Khan, 2003).

Economic considerations also drive the decision whether to buy hardware and build the data warehouse in-house or to use a cloud-based data warehouse solution. Conventional data warehouses, which are based on centralized proprietary databases, are costly when storing vast quantities of data. Therefore, the decision to use a data warehouse for unstructured, sensor-generated streaming data can also be driven by cost.
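
A build-versus-cloud comparison of this kind can be sketched as a simple break-even calculation. All figures and function names below are hypothetical and chosen only for illustration. A real cost/benefit analysis would also include benefits, discounting and staffing.

```python
def cumulative_cost(upfront, yearly, years):
    """Total cost of ownership after a number of years."""
    return upfront + yearly * years


def break_even_year(in_house, cloud, horizon=10):
    """First year in which the in-house option is no more expensive than cloud.

    Each option is an (upfront, yearly) pair. Returns None if the in-house
    option never catches up within the horizon.
    """
    for year in range(1, horizon + 1):
        if cumulative_cost(*in_house, year) <= cumulative_cost(*cloud, year):
            return year
    return None


# Hypothetical figures for illustration only.
in_house = (500_000, 50_000)   # hardware purchase, then running costs
cloud = (20_000, 150_000)      # setup fee, then subscription

print(break_even_year(in_house, cloud))
```

With these assumed numbers the in-house option only becomes cheaper in year 5, so a shorter planning horizon would favor the cloud option.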

References

Khan, A. (2003). Data Warehousing 101: Concepts and Implementation. San Jose, CA: Khan Consulting and Publishing, LLC.

Kimball Group. (2016). Grain.

Reply to Discussion Post (120-150 words with references):

Task 2: 130-150 words with reference

Discussion Topic:

Define the importance of ontologies for unstructured data warehouses. Provide an example of unstructured data and the use of an ontology used to manage this data.

Discussion Post:

An ontology is a set of concepts within a specified domain together with the relationships between those concepts. It is a structural framework that is commonly used to organize information. An ontology is used to describe the entities that exist in a specific domain and to define the domain itself.

With unstructured data, ontologies make navigation easier because a user can move through the ontology from one concept to a related one.

Ontologies also help improve data management, and they can be extended to support relationship and concept matching. Taxonomies and ontologies both play an important part in handling unstructured data. They help with the analytical processing of unstructured data.
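
As an example, free-text support emails are unstructured data, and a small product ontology can be used to tag and navigate them. The sketch below is a toy illustration under assumptions: the concepts, the `PARENT` mapping and the helper functions are hypothetical and are not taken from Ontotext.

```python
# A toy ontology: each concept maps to its broader ("is-a") parent concept.
# Concepts and terms are hypothetical, chosen only for illustration.
PARENT = {
    "laptop": "computer",
    "desktop": "computer",
    "computer": "hardware",
    "printer": "hardware",
    "hardware": "product",
}


def ancestors(concept):
    """Walk from a concept up to the root, returning the broader concepts."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def tag_document(text):
    """Return every ontology concept a free-text document relates to."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    tags = set()
    for concept in PARENT:
        if concept in words:
            tags.add(concept)
            tags.update(ancestors(concept))
    return tags


email = "My laptop will not start after the update."
print(sorted(tag_document(email)))
```

Because the email is tagged with the broader concepts as well, a query for "hardware" would find it even though the word never appears in the text.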

References

Ontotext. (2018). What are Ontologies?

Reply to Discussion Post (120-150 words with references):

Task 3: 130-150 words with reference

Discussion Topic:

Search the web for an instance involving the use of data mining for cluster or outlier analysis. (One good example is fraud detection). Describe the example and relate what the impact was. Provide the link. Use references and justification to support your point of view.

Discussion Post:

Observing the world around us, we naturally tend to organize, group, differentiate and catalog what we see in order to understand it better. Similarly, clustering helps marketers improve their customer base, focus on target areas and segment customers based on historical purchases, interests or activity.

One instance of data mining for cluster or outlier analysis is a study of prepaid telecom customer segmentation using the k-means algorithm. The study was carried out on natural persons who are prepaid subscribers. These customers have no contractual relationship with the telecom carrier and buy credit in advance.

The analysis excluded people who had not recharged within the past three months and had not spent anything on calls, SMS or Internet in that period. It aimed to identify subscriber profiles in the overall population and to determine the efficiency of k-means cluster analysis for high data volumes.

Subscribers were classified into several categories using the following variables: the sum of the amounts recharged in six months, the value of SMS sent in the six months, the value of Internet traffic in the six months and the value of calls made in the six months. To group subscribers into segments, the study used the non-hierarchical k-means clustering method. This algorithm segments the population so that the variation inside each cluster is kept to a minimum.

The analysis groups subscribers into segments based on their behavioral values (recharge value, call value, SMS expenditure and Internet expenditure). The ANOVA analysis ranked the factors' contribution to splitting the population into groups in the following order: recharge value, call value, Internet expenditure and sent SMS value. The Sig. value was smaller than 0.05, so the results are statistically significant.
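
The segmentation idea can be illustrated with a minimal k-means sketch in plain Python. It is not the study's implementation: the six subscriber rows are made-up values in the order (recharge, calls, SMS, Internet), the starting centroids are simply the first k rows, and the variables are left unscaled for brevity.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means: assign points to nearest centroid, then re-average."""
    centroids = [list(p) for p in points[:k]]  # simple deterministic start
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical six-month values: (recharge, calls, SMS, Internet).
subscribers = [
    (10, 6, 2, 1),
    (120, 70, 10, 35),
    (12, 7, 1, 2),
    (110, 65, 12, 30),
    (9, 5, 2, 1),
    (130, 75, 9, 40),
]

labels, centers = kmeans(subscribers, k=2)
print(labels)
```

Each pass reassigns subscribers to the nearest centroid and then recomputes the centroids, which is how the method keeps the variation inside each cluster low. On real data the variables would normally be standardized first so that recharge value does not dominate the distance.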

Reference:

Bacila, M.-F., et al. (2012). Prepaid Telecom Customer Segmentation Using the K-Mean Algorithm.

Reply to Discussion Post (120-150 words with references):

Task 4: 130-150 words with reference

Discussion Topic:

Search the web for an instance involving the use of data mining for cluster or outlier analysis. (One good example is fraud detection).

Describe the example and relate what the impact was. Provide the link. Use references and justification to support your point of view.

Discussion Post:

Clustering is a data mining technique that automatically groups objects with similar characteristics into meaningful or useful clusters. As technology has become an integral part of business processes, the transfer of information has become more complicated, so data mining techniques can play an important role. Various industries use clustering techniques to solve challenging organizational issues.

The stock market is one good example where clustering is applied. Hajizadeh, Davari and Shahrabi (2010) describe a pairwise clustering approach applied to the analysis of the Dow Jones index companies in order to identify similar temporal behavior of the traded stock prices.

The objective of this analysis is to understand the underlying dynamics that govern the companies' stock prices. In particular, it is useful to find, inside a given stock market index, groups of companies sharing a similar temporal behavior.

For this purpose, a clustering approach may be a good strategy. The chaotic map clustering algorithm is used: a map is associated with each company, and the correlation coefficients of the financial time series are associated with the coupling strengths between maps.

The simulation of the chaotic map dynamics gives rise to a natural partition of the data, as companies belonging to the same industrial branch are often grouped together. The identification of clusters of companies within a given stock market index can be exploited in portfolio optimization strategies (Hajizadeh, Davari and Shahrabi, 2010).
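
The role of the correlation coefficients can be illustrated with a much simpler stand-in for chaotic map clustering: linking companies whose return series correlate above a threshold. This sketch is not the algorithm described in the survey. The tickers, return values and the 0.8 threshold are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_groups(returns, threshold=0.8):
    """Group tickers whose return series are linked by high correlation."""
    groups = []
    for ticker in returns:
        # Find every existing group that contains a strongly correlated ticker.
        linked = [
            g for g in groups
            if any(pearson(returns[ticker], returns[t]) >= threshold for t in g)
        ]
        merged = {ticker}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# Hypothetical daily returns for four made-up tickers.
returns = {
    "BANK_A": [0.01, -0.02, 0.015, 0.00, -0.01],
    "BANK_B": [0.012, -0.018, 0.013, 0.001, -0.009],
    "TECH_A": [-0.02, 0.03, -0.01, 0.02, 0.005],
    "TECH_B": [-0.018, 0.028, -0.012, 0.022, 0.004],
}

groups = correlation_groups(returns)
print(groups)
```

In this made-up data the two bank series move together and the two technology series move together, so the grouping recovers the two industrial branches.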

Reference

Hajizadeh, E., Davari, H. and Shahrabi, J. (2010). Application of data mining techniques in stock markets: A survey. [online] academia.






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd