The University of Queensland
Course: BISM 1201
Subject: Information Systems
Date: Jan 5, 2025
Data warehouse
- Central repository for an organization's entire data collection.
- Gathers information from various sources (transaction systems; marketing, finance and CRM systems).
- Stores it in a structured format for comprehensive analysis.

Data mart
- Focused subset of a data warehouse.
- Serves the specific needs of a particular department or business unit, e.g. sales figures or marketing campaign results.
- Pre-summarized, filtered and optimized for quick analysis by departmental users.

Key differences
- Scope: data warehouses are broad, encompassing data from the entire organization; data marts are narrow, targeting a specific department or business function.
- Users: data warehouses serve various departments across the company; data marts cater to the specific needs of a particular department.
- Data: data warehouses store detailed, raw data; data marts typically hold summarized and pre-processed data for faster analysis.

Supervised ML
- You provide the machine/computer with labelled data.
- The computer analyses these examples and learns to identify patterns.
- Requires labelled data (predefined categories). The model learns from examples with known outcomes to classify new, unseen data points, e.g. classifying emails as spam or important.

Unsupervised learning
- Like giving the computer a pile of unlabelled pictures and asking it to group similar ones.
- It finds hidden patterns within the data, e.g. grouping days with similar temperatures.
- The model identifies patterns and relationships within the data itself, without predefined categories.

BI
- Techniques or methods for transforming raw data into actionable insights.
- Used to make better business decisions.
- Includes data visualization, data mining and OLAP.

BI as running a bakery:
1. Gather ingredients: sales figures, customer information, inventory levels.
2. Mix and bake (data analysis): tools analyse the data to find patterns and trends.
3. Serve insights (visualization): present the results in easy-to-understand formats like charts and graphs.
Example: croissants are popular but the bakery's profits are low. BI might reveal high ingredient costs, so you can adjust the recipe or source ingredients more efficiently. BI transforms data into valuable knowledge for a thriving bakery.

Benefits:
- Informed decisions: BI helps you understand your business better, leading to smarter choices.
- Improved performance: identify areas for improvement and track progress.
- Competitive advantage: gain insights to stay ahead of the competition.

Data mining:
Association: finding frequent pairings. Identifies relationships between data points, uncovering items frequently bought together; used to recommend complementary products.
Clustering: grouping similar data.
- Like organising shells on a beach: clustering groups data points with similar characteristics/features.
- Example: grouping customers based on their purchase history (high spenders vs budget shoppers).
Classification: sorting into categories (supervised learning).
- Like sorting laundry: classification assigns data points to predefined categories based on existing labels.
- Example: classifying emails as spam or important.
Sequencing (unsupervised learning).
- Think of traffic light patterns: sequencing analyses the order in which events occur.
- Discovers sequential patterns in data, to understand the order of events.
- Example: analysing website clickstream data to understand user behaviour (which pages users visit and in what order).
Forecasting (unsupervised and supervised).
- Like a weather forecast: forecasting uses historical data to predict future trends.
- Analyses trends and patterns to predict future outcomes.
- Example: predicting sales figures from historical data and market trends, to manage inventory levels or plan marketing campaigns.

OLAP cube
- Organizes and analyses data efficiently; picture a multi-layered puzzle cube.
- There are different categories (dimensions) you can analyse your data by.
- Faces of the cube: product, location, time.
- Measures: the values you want to analyse, i.e. the actual numbers. Imagine the small cubes within each section holding these specific details.
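The cube described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how an OLAP product stores data: the sales numbers, the names SALES, slice_cube and roll_up, and the choice of a dictionary keyed by (product, location, week) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical bakery facts: (product, location, week) -> units sold (the measure).
SALES = {
    ("sourdough", "downtown", "W1"): 120,
    ("sourdough", "suburbs", "W1"): 80,
    ("baguette", "downtown", "W1"): 95,
    ("baguette", "suburbs", "W1"): 60,
    ("sourdough", "downtown", "W2"): 130,
    ("baguette", "downtown", "W2"): 90,
}

# Order of the dimensions inside each key.
DIMENSIONS = ("product", "location", "week")


def slice_cube(cube, **fixed):
    """Keep only the cells whose dimension values match the given ones."""
    return {
        key: units
        for key, units in cube.items()
        if all(key[DIMENSIONS.index(dim)] == value for dim, value in fixed.items())
    }


def roll_up(cube, dimension):
    """Sum the measure over every dimension except the one named."""
    index = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[index]] += units
    return dict(totals)


# Slice on location, then total the units per product.
downtown = slice_cube(SALES, location="downtown")
print(roll_up(downtown, "product"))  # {'sourdough': 250, 'baguette': 185}
```

A real cube pre-computes many of these totals so that queries return quickly; here they are computed on demand to keep the sketch short.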
Imagine you want to see which bread sells best downtown. With an OLAP cube, you can focus on the downtown section of the location dimension and see the sales figures for each type of bread. This helps identify top sellers and adjust your inventory accordingly.

Benefits of OLAP cube:
- Fast analysis: the data is pre-calculated and summarized for different combinations, which allows quick analysis.
- Multidimensional view: you can slice and dice the data by focusing on specific combinations, e.g. sourdough sales in the suburbs this week.
- Improved decisions: by seeing how different factors affect sales you can make better decisions, e.g. stocking more of the popular bread types in different locations.

Market-based analysis
- Like planning your business journey: it involves studying the market environment to understand your target audience, competitors and the overall landscape.
- Covers customer analysis, competitor analysis and market trends.
Benefits:
- Informed decision making: better decisions about product development, pricing and marketing strategies.
- Identifying opportunities: market analysis helps you spot potential gaps in the market and identify new business opportunities.
- Staying competitive: by analysing competitor strategies and market trends, you can stay ahead of the curve and adapt your business accordingly.
- In short: customer needs, competitors, market trends.

AI: machines mimic human cognitive functions like learning and problem solving.
Neural network: an AI technique inspired by the brain’s structure, using layers of interconnected nodes to process information and data.
Expert system: uses knowledge/rules to solve a problem. Captures expert knowledge in a specific domain.
Blockchain: a distributed ledger technology that securely stores data in a way that is transparent and tamper-proof.
Hadoop: open-source framework for storing and processing large datasets across clusters of computers. Picture a large warehouse with rows of computer servers.
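To make the Hadoop idea of spreading work over many machines more concrete, here is a rough single-process sketch in the MapReduce style that Hadoop is known for. It does not use Hadoop or its API: the "nodes" are just list slices, and the names split_into_chunks, map_chunk, reduce_counts and the sample log lines are made up for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(lines, n_chunks):
    """Divide the input into pieces, one per (pretend) node."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: each node counts the words in its own piece only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(lines, n_chunks=3):
    """Run the map step on every chunk, then combine the partial counts."""
    partials = [map_chunk(chunk) for chunk in split_into_chunks(lines, n_chunks)]
    return reduce(reduce_counts, partials, Counter())


# Hypothetical log lines standing in for a large dataset.
LOG_LINES = [
    "order placed order shipped",
    "order cancelled",
    "refund issued order closed",
    "order placed",
]

totals = word_count(LOG_LINES)
print(totals["order"], totals["placed"])  # 5 2
```

The result does not depend on how many chunks the data is split into, which is what lets the work be spread over many inexpensive machines.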
BISM1201 SEM1 2021 PAST PAPER

Q1.
- A customer can only have one loyalty level, but a loyalty level can have many customers.
- Each loyalty level must have one agent who manages the customers at that level (an agent can look after many loyalty levels).
Foreign keys:
- The customer table has loyalty_id, referencing the loyalty table’s loyalty_id.
- The loyalty table has agent_id, referencing the customeragent table’s agent_id.

Q2.
a. Explain the problem with the table design: there is a lot of redundancy.
- Lucy Xi appears 3 times in the table for 1 order, just because she ordered 3 different items.
- Addresses are duplicated for the same customer.
b. Normalize the data. Create a table for customer, order and item.
- customer: cust_id (PK), name, address
- order: cust_id (FK), order_number (PK), order_date, item_id (FK), quantity, price
- item: item_id (PK), item_name

Q3.
a. A customer can have a minimum of 0 orders (0 to many).
b. An order can have one and only one customer.
c. Lookup entities: movie, book.
d. Linking entities: order_details.
e. It would result in a lot of NULL values, because not all attributes relate to every product type. For example, only movies have a release_date and duration, while only books have ISBN and genre. It is generally considered best practice to avoid NULL values as much as possible, as they can cause a lot of database problems.

Q4.
1. On the ERD, the zero-to-many relationship between actor and agent should instead be modelled as ‘an actor can have one and only one agent’.

Q5.
a. Hadoop: an open-source software framework for storing data and running applications on clusters of off-the-shelf hardware.
d. A data warehouse is designed to analyse data derived from transactional sources for business intelligence.
b. A data mart is a very specific portion of the organization's data for a specific population of users.
c. An RDBMS is not structured to handle analytics well.

Q6.
1. A data warehouse collects data from many sources, including point-of-sale systems, ERPs, legacy systems and external web documents, which allows ‘New Stuff’ to perform analyses, extract insights from their data, monitor business performance and improve decision making.
2. A data warehouse separates analytics processing from transactional databases, so ‘New Stuff’s’ transactional databases can continue to run at optimum performance for daily operations while the company starts running business intelligence and analytics on the data warehouse.

Q7.
Data mining
- Provides insights into business data by finding hidden patterns and relationships in large databases.
- What the manager is describing is data mining for associations.

Q8.
- Expert system: a set of specified rules that the system follows. It captures the expertise of a human in a limited domain of knowledge.
- SML: a system trained by giving it specific examples of desired inputs and outputs, identified by humans in advance. It learns by example and improves its decision making or predictive accuracy over time.
- USML: the system processes data and reports whatever it finds. Humans do not feed the system examples.
- AI: computer systems that think and act like humans; the simulation of human intelligence by machines.

Q9.
- The manager already has a collection of verified pictures that can be used to train the machine learning system. Unsupervised ML does not involve a human giving the system examples; rather, the system identifies its own patterns and assigns its own tags. SML is provided with tags. Etc…

Q10.
1. Centralized customer data: a CRM system provides a centralized database where all customer information, interactions and transactions are stored. This enables managers to have a comprehensive view of their customers’ history, preferences and needs in one place.
2. Improved customer relationships: by having access to detailed customer profiles and interaction history, managers can better understand their customers’ needs and preferences. This enables them to tailor communication and offerings to individual customers, leading to improved customer satisfaction and loyalty.
3. Enhanced communication: CRM systems often include communication tools such as email integration and automated messaging, allowing managers to communicate with customers more efficiently and effectively. They can send targeted messages, follow up on inquiries and provide personalized support, fostering stronger relationships.
4. Streamlined sales processes: CRM systems can automate many sales-related tasks, such as lead management, pipeline tracking and quotation generation. This streamlines the sales process, reduces administrative overhead and lets sales teams focus more on closing deals and building relationships.
b. Three examples of questions that a CRM system can help answer:
Customer behaviour analysis:
- What products or services are our customers purchasing most frequently?
- Which marketing campaigns or channels are driving the highest customer engagement and conversions?
- Are there any patterns or trends in customer interactions that indicate potential upsell or cross-sell opportunities?
Customer satisfaction measurement:
- What is the overall satisfaction rating of our customers based on recent interactions or surveys?
- Are there any unresolved customer issues or complaints that need immediate attention?
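The first CRM question above (which products are purchased most frequently) comes down to an aggregate query over customer, item and order data. Below is a small sketch using Python's built-in sqlite3 module. The table layout loosely follows the Q2 answer, but the order_line table with a composite key, the function name top_items and all sample rows are assumptions made for this example.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE order_line (
        order_number INTEGER,
        cust_id INTEGER REFERENCES customer(cust_id),
        item_id INTEGER REFERENCES item(item_id),
        quantity INTEGER,
        PRIMARY KEY (order_number, item_id)  -- assumed: one row per item per order
    );
    INSERT INTO customer VALUES
        (1, 'Customer A', '1 Example St'),
        (2, 'Customer B', '2 Sample Rd');
    INSERT INTO item VALUES (10, 'Notebook'), (11, 'Pen'), (12, 'Stapler');
    INSERT INTO order_line VALUES
        (100, 1, 10, 2), (100, 1, 11, 5), (100, 1, 12, 1),
        (101, 2, 11, 4), (102, 1, 10, 1);
    """
)


def top_items(connection, limit=3):
    """Return (item_name, total units) pairs, best sellers first."""
    return connection.execute(
        """
        SELECT i.item_name, SUM(o.quantity) AS units
        FROM order_line AS o
        JOIN item AS i ON i.item_id = o.item_id
        GROUP BY i.item_id
        ORDER BY units DESC, i.item_name
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


print(top_items(conn))  # [('Pen', 9), ('Notebook', 3), ('Stapler', 1)]
```

Because the customer's name and address live only in the customer table, an order with three items repeats just the cust_id, not the whole address, which is the redundancy Q2a points out.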
QUICK CARDS

Neural networks
- Humans train the network by feeding it a set of outcomes they want the machine to learn.
- A neural network uses an internal, hidden layer of logic that examines existing data and assigns a classification to that set of data.

Types of AI
- Expert systems
- Machine learning
- Neural networks
- Genetic algorithms
- Natural language processing
- Computer vision systems
- Robotics
- Intelligent agents; chatbots

Predictive analytics
- PA can use the big data generated from social media, consumer transactions, and sensor and machine output.
- Uses statistical analysis, data mining, historical data and assumptions about future conditions.
- Extracts information from data to make predictions.
- Slack Technologies (cloud-based team collaboration software), which has 10 million users, uses PA to identify customers who are most likely to use its products frequently and upgrade to paid services.

Business value of improved decision making
- Good database design + good data + good business intelligence infrastructure + useful analysis = better decision making.
- It is possible to measure the value of improved decision making.
- Decisions are made at all levels of the firm.
- Some are common, routine and numerous.
- Although the value of improving any single decision may be small, improving hundreds of thousands of small decisions adds up to a large annual value for the business.

Web mining
- Discovery and analysis of useful patterns and information from the web.
- Used to understand customer behaviour, evaluate a website and quantify the success of marketing.

Content mining
- Mines the content of websites.
- Extracts knowledge from the content of web pages (text, audio, images).

Structure mining
- Mines website structural elements, such as links.
- Links pointing to a document indicate the popularity of the document.
- Links going out of a document indicate the variety of topics covered in it.

Usage mining
- Mines user interaction data gathered by web servers.
- Who clicked where, who searched for what, how long people spend on a page, what search terms they used.
- Analysing such data can help companies determine the value of particular customers, cross-marketing strategies across products, and the effects of promotional campaigns.

Text mining
- Works on unstructured data.
- Allows businesses to extract key elements from, discover patterns and relationships in, and summarize large unstructured data sets.
- Sentiment analysis: mines text comments online or in email to measure customer sentiment.
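A very rough sketch of the sentiment analysis idea, using simple word counting in Python. Real text mining tools are far more sophisticated; the word lists POSITIVE and NEGATIVE, the function names and the sample comments are all invented for illustration.

```python
import re

# Tiny hypothetical word lists; a real lexicon would be much larger.
POSITIVE = {"great", "love", "fast", "friendly", "excellent"}
NEGATIVE = {"slow", "broken", "rude", "refund", "terrible"}


def sentiment_score(comment):
    """Count positive words minus negative words in one comment."""
    words = re.findall(r"[a-z']+", comment.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(comment):
    """Turn the score into a coarse sentiment label."""
    score = sentiment_score(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical customer comments (unstructured text).
COMMENTS = [
    "Love the new app, checkout is fast!",
    "Delivery was slow and the box arrived broken.",
    "Order arrived on Tuesday.",
]

for comment in COMMENTS:
    print(label(comment), "|", comment)
```

Word counting like this misses negation and sarcasm ("not great"), which is one reason real sentiment tools rely on trained models rather than fixed lists.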
Data mining forecasting
- Forecasting uses predictions in a different way: it uses a series of existing values to forecast what other values will be.
- Forecasting finds patterns in data to help managers estimate the future value of continuous variables such as sales figures.

Data mining clustering
- Similar to classification, but used when no groups have yet been defined.
- A data mining tool can discover different groupings within data, such as finding affinity groups for bank cards or partitioning a database into groups of customers based on demographics and type of personal investments.
- Like a cluster analysis.
- Example: a wide range of customers buy bikes. Clustering can group them by their demographics and/or the type of bike they buy.

DM classification
- Recognizes patterns that describe the group to which an item belongs, by examining existing items that have been classified and inferring a set of rules.
- Example: a mobile phone company worries about losing customers. Classification helps discover the characteristics of customers who are likely to leave and can provide a model to help managers predict who those customers are, so that managers can devise special campaigns to retain them.
- Example: a bank loan officer wants to analyse loan applicant data to find out which applicants are risky and which are safe.

DM sequences
- Events linked over time.
- Seeks to identify similar patterns, regular events or trends in transaction data over a period.
- With historical transaction data, a business can identify a set of items that customers buy together at different times of the year. The business can then use this information to recommend those items to customers, with better deals based on their past purchasing frequency.
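As a small illustration of the forecasting card above (using a series of existing values to estimate the next one), here is a sketch that fits a straight-line trend by least squares. A linear trend is only one of many possible forecasting methods and is chosen here for simplicity; the function names and the monthly sales figures are made up.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(values, steps_ahead=1):
    """Extend the fitted line beyond the last observed value."""
    slope, intercept = fit_trend(values)
    return slope * (len(values) - 1 + steps_ahead) + intercept


# Hypothetical monthly sales that grow by 10 units each month.
MONTHLY_SALES = [100, 110, 120, 130]
print(forecast(MONTHLY_SALES))  # 140.0
```

Real sales data is rarely this tidy, so practical forecasts also account for seasonality and report a margin of error.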
DM associations
- Best-known data mining technique.
- A pattern is discovered based on a relationship between items in the same transaction.
- Example: a study of supermarket purchasing patterns reveals that when corn chips are purchased, a cola drink is purchased 65% of the time.
- But when there is a promotion, a cola drink is purchased 85% of the time.

Associations: occurrences linked to a single event
Sequences: events linked over time
Classifications: patterns describing the group an item belongs to
Clustering: discovering as yet unclassified groupings
Forecasting: uses a series of values to forecast future values

Data mining
- OLAP answers queries such as ‘compare sales of product 403 relative to plan by quarter and sales for the past two years’.
- Data mining is more discovery-driven.
- Provides insights into business data that cannot be obtained with OLAP, by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behaviour.
- The patterns and rules are used to guide decision making and forecast the effect of those decisions.

OLAP
Dimensions: anything that can consistently categorize data.
Measures: numerical values that can be added to provide meaning to dimensions.
Hierarchies: the granularity of the data; very relevant when we drill down into the data.
Strengths:
- Speedy provision of quality data
- Increased opportunity to make more use of data
Weaknesses:
- Heavy reliance on IT department
- Challenge of change: managing the type and number of queries

Multidimensional data model
- One view (face) of the cube shows product vs region.
- If you rotate the cube 90 degrees, the face that shows is product vs actual and projected sales.
- If you rotate the cube 90 degrees again, you see region vs actual and projected sales.
- Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions.

A data warehouse is an archive where historical corporate data is stored for later analysis. It can use different technologies for data extraction and analysis, and OLAP is one of those technologies: it analyses and evaluates data from the data warehouse.

OLAP
- OLAP technology provides a powerful and user-friendly environment for analysing data, gaining insights and exploring business trends.
- It is widely used in BI systems to support reporting, ad hoc analysis and decision-making processes.
- The production database stores all transaction data. It is mission-critical and has very little capacity for additional load.
- An OLAP cube is a snapshot of data at a specific point in time, perhaps at the end of a specific day, week, month or year.
- The business analysts and managers (the users of the business intelligence) then use the OLAP cube/database.
- Win-win: the production database is not overloaded, and end users have the data or BI they need for analysis and decision making.

Hadoop vs traditional database
- Hadoop can't be called a database.
- It is more of a distributed file system that can store and process huge volumes of data across a cluster of computers.
- An RDBMS is a structured database approach in which data is stored in rows and columns, organised into tables, and can be queried and updated with SQL.

Unstructured data
- Text data, social media comments, call transcripts, log files, images, audio, videos.
- Data that doesn’t fit in a spreadsheet with rows and columns.

Structured data
- Data that fits neatly within fixed fields and columns in relational databases and spreadsheets.
- Names, student exam results, customer data, credit card numbers, geolocation etc.

HADOOP
- Hadoop is an open-source framework used to efficiently store and process large datasets, ranging in size from gigabytes to petabytes.
- Instead of using one large computer to store and process the data, Hadoop clusters multiple computers to analyse massive datasets in parallel, more quickly.
- Handles both structured and unstructured data.
- Breaks a problem into sub-problems and distributes the processing to many inexpensive computer processing nodes.

Data mart
- A subset of a data warehouse in which a summarized or highly focused portion of the organization’s data is placed in a separate database for a specific population of users.
- Example: a business may choose to collect sales and marketing data to focus on customer information and profiles. Businesses with different sales channels may choose to keep data collection and use separate.

Benefits of data warehouse
- Informed decision making
- Consolidated data from many sources
- Historical data analysis
- Data quality, consistency and accuracy
- Separation of analytics processing from transactional databases, which improves performance

Data warehouse
- A central repository of information that stores current and historical data that can be analysed to make more informed decisions.
- Data flows into a data warehouse from transactional systems, relational databases and other sources, typically at regular intervals.
- A data warehouse is specifically designed for data analytics, which involves reading large amounts of data to understand relationships and trends across the data.

BI
- Helps users make better business decisions.
- Leverages software and services to transform data into actionable insights that inform an organization’s strategic and tactical business decisions.
- Users access and analyse data sets and present analytical findings in reports, summaries, dashboards, graphs and charts.
- Provides users with detailed intelligence about the state of the business.

Blockchain
- Distributed database of transactions.
- Operates on a network without a central authority.
- Maintains a growing list of records called blocks.
- Once recorded, blocks cannot be changed.
- Reduces the cost of processing transactions and enhances security.
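The claim that recorded blocks cannot be changed rests on each block storing a hash of the block before it. Here is a toy sketch of that hash chain in Python, using the standard hashlib module. It is a single in-memory list, so it leaves out the network, consensus and digital signatures that a real blockchain needs; the function names and the sample transactions are invented.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """SHA-256 over the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a new block that points back to the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check each block links to the one before it."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


ledger = []
add_block(ledger, "A pays B 5")
add_block(ledger, "B pays C 2")
print(is_valid(ledger))  # True

ledger[0]["data"] = "A pays B 500"  # tamper with an old record
print(is_valid(ledger))  # False
```

Changing an old block changes its hash, which breaks the link stored in every later block, so tampering is easy to detect.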