Date: 2024-03-24  Source: hfw.cc  Author: hfw.cc



FACULTY OF ENGINEERING, BUILT-ENVIRONMENT, AND INFORMATION
TECHNOLOGY (FOEBEIT)
BACHELOR OF INFORMATION TECHNOLOGY (HONS)
JANUARY-MAY 2024 INTAKE
TCS3393 DATA MINING
GROUP ASSIGNMENT [2-3 members per group]
This assignment is worth 25% of the overall marks available for this module. This assignment
aims to help the student explore and analyse a set of data and reconstruct it into meaningful
representations for decision-making.
Introduction
The online landscape is ever-evolving, with websites serving as crucial assets for businesses,
organizations, and individuals. As the internet continues to grow, the need for accurate and
efficient website classification becomes paramount. Understanding the nature of websites, their
content, and the user experience they provide is vital for various purposes, including online
security, marketing strategies, and content filtering.
Embarking on a data science project, you collaborate with a cybersecurity firm dedicated to
enhancing web security measures. The firm provides you with a rich dataset encompassing
various attributes of websites, including their URLs, user comments, and assigned categories.
Your objective is to develop a classification model capable of accurately categorizing websites
based on these variables.
The dataset includes information on the URLs of different websites, user comments associated
with those websites, and pre-existing categories assigned to them. The challenge lies in creating
a model that not only accurately classifies websites but also adapts to the dynamic nature of the
online environment, where new types of websites constantly emerge.
Your goal is to implement advanced data analysis techniques to train a model that enhances the
efficiency of web classification.
Techniques
The data exploration, manipulation, transformation, and visualization techniques needed to
explore the dataset are covered in the course. As an additional feature, you must explore
further concepts that can improve the results. The dataset provided for this assignment
relates to website classification.
Dataset
This dataset contains information on 1,407 website URLs. It includes three variables that describe
various categories of websites. The dataset will be analyzed using subsets of these variables for
descriptive and quantitative analyses, depending on the specific models used.
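A first look at the dataset could be sketched as below. This is a minimal, standard-library illustration rather than the required approach: the brief does not give a file name, and only the 'Category' and 'cleaned_website_text' column names appear in it, so the 'website_url' column name and the sample rows are assumptions.

```python
import csv
import io

# Hypothetical stand-in for the real file; the brief does not name it.
# 'Category' and 'cleaned_website_text' appear in the brief; 'website_url' is assumed.
SAMPLE_CSV = """website_url,cleaned_website_text,Category
https://example-travel.com,cheap flights hotel booking deals,Travel
https://example-news.org,breaking news politics world report,News
https://example-shop.net,,E-Commerce
"""


def load_rows(handle):
    """Read a CSV file object into a list of dicts, one per website."""
    return list(csv.DictReader(handle))


rows = load_rows(io.StringIO(SAMPLE_CSV))
columns = list(rows[0].keys())
print(f"{len(rows)} observations, {len(columns)} variables: {columns}")
```

With the real data, the same function would take an open file, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the provided file is stored.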
Assignment Task: Websites Classification
Objective:
Develop a classification model to categorize websites using advanced data science techniques. The
model should robustly classify each website based on the comments recorded in the dataset.
Tasks:
1. Data Exploration:
• Conduct an initial exploration of the dataset to understand its structure, size, and
variables.
• Examine the distribution of website categories to identify any imbalances in the
dataset.
• Explore the distribution of URL and user comment lengths to gain insights into the
data.
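The category-balance check in the Data Exploration task could be sketched as follows. The records are made-up toy values, and the helper names are hypothetical; only the 'Category' and 'cleaned_website_text' column names come from the brief.

```python
from collections import Counter

# Toy records standing in for the real dataset (values are invented).
records = [
    {"Category": "Travel", "cleaned_website_text": "cheap flights hotel booking"},
    {"Category": "Travel", "cleaned_website_text": "beach resort holiday packages deals"},
    {"Category": "News", "cleaned_website_text": "breaking news world politics"},
    {"Category": "Travel", "cleaned_website_text": "city guide tours"},
]


def category_distribution(rows):
    """Return (category, count, share) tuples, most common first."""
    counts = Counter(row["Category"] for row in rows)
    total = sum(counts.values())
    return [(cat, n, n / total) for cat, n in counts.most_common()]


def imbalance_ratio(rows):
    """Largest class size divided by smallest class size (1.0 means balanced)."""
    counts = Counter(row["Category"] for row in rows)
    return max(counts.values()) / min(counts.values())


dist = category_distribution(records)
for cat, n, share in dist:
    print(f"{cat:<10} {n:>4} {share:6.1%}")
print("imbalance ratio:", imbalance_ratio(records))

# Text length in words, one value per website.
text_lengths = [len(r["cleaned_website_text"].split()) for r in records]
print("text lengths (words):", text_lengths)
```

A ratio well above 1 suggests that plain accuracy may be misleading later on, which is one reason to report per-class metrics as well.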
2. Descriptive Analysis:
A. Basic Exploration:
• Describe the structure of the dataset. How many observations and variables
does it contain?
• What are the data types of the variables in the dataset?
B. Statistical Summary:
• Provide a statistical summary of the 'Category' variable. What are the most
common website categories?
• Calculate basic descriptive statistics (mean, median, standard deviation) for
relevant numeric variables.
C. URL Analysis:
• Analyze the distribution of website URLs. Are there any patterns or
commonalities?
• Are there any outlier URLs that need special attention?
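For the statistical summary and URL analysis, a minimal sketch using only the standard library might look like this. The URLs are invented examples, and the Tukey-fence rule (1.5 × IQR) is one common outlier convention chosen here for illustration, not something the brief prescribes.

```python
import statistics
from collections import Counter
from urllib.parse import urlparse

# Invented example URLs; the last one is deliberately long.
urls = [
    "https://www.example.com",
    "https://news.example.org/world",
    "http://shop.example.com/cart",
    "https://blog.example.net",
    "https://www.example.org",
    "https://docs.example.com",
    "https://www.example.net/faq",
    "https://www.example.com/a/very/long/path/that/goes/on/and/on/index.html?ref=tracking",
]


def length_summary(values):
    """Mean, median, and sample standard deviation of string lengths."""
    lengths = [len(v) for v in values]
    return {
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "stdev": statistics.stdev(lengths),
    }


def top_level_domains(values):
    """Count the last label of each hostname (e.g. 'com', 'org')."""
    return Counter(urlparse(u).hostname.rsplit(".", 1)[-1] for u in values)


def length_outliers(values, k=1.5):
    """Return URLs whose length falls outside Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles([len(v) for v in values], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if not low <= len(v) <= high]


print(length_summary(urls))
print(top_level_domains(urls).most_common())
print(length_outliers(urls))
```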
3. Data Preprocessing:
A. Cleaning Text Data:
• Explore the 'cleaned_website_text' variable. What preprocessing steps would
you take to clean text data for analysis?
• Implement text cleaning techniques and explain their importance in preparing
data for text-based analysis.
B. Handling Missing Values:
• Identify if there are any missing values in the dataset. Propose strategies for
handling missing values, specifically in the 'cleaned_website_text' column.
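The cleaning and missing-value steps could be sketched as below. The stop-word list is a tiny illustrative sample, the function names are hypothetical, and separating out rows with empty text is just one possible strategy (imputing a placeholder token is another).

```python
import re

# A tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "is"}


def clean_text(text):
    """Lower-case, strip URLs, HTML tags and non-letters, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove embedded URLs
    text = re.sub(r"<[^>]+>", " ", text)       # remove HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


def handle_missing(rows, column="cleaned_website_text"):
    """Split rows into those with usable text and those without."""
    kept, dropped = [], []
    for row in rows:
        value = row.get(column)
        (kept if value and value.strip() else dropped).append(row)
    return kept, dropped


sample = "<p>Visit the BEST Hotels in 2024! https://example.com</p>"
print(clean_text(sample))

rows = [
    {"cleaned_website_text": "hotel deals"},
    {"cleaned_website_text": ""},
    {"cleaned_website_text": None},
    {"cleaned_website_text": "   "},
]
kept, dropped = handle_missing(rows)
print(len(kept), "kept,", len(dropped), "without text")
```

Reporting how many rows fall into the second group, and which categories they belong to, helps justify whichever strategy is chosen.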
4. Visualization:
A. Category Distribution Visualization:
• Create a bar chart or pie chart to visually represent the distribution of website
categories.
• How does the visualization help in understanding the balance or imbalance of
the dataset?
B. Text Data Visualization:
• Generate word clouds or frequency plots for the 'cleaned_website_text'
variable. What insights can be gained from these visualizations?
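Charts would normally be drawn with a plotting library. As a dependency-free illustration of the same idea, the sketch below computes word frequencies and renders any (label, count) pairs as a text bar chart. The texts and helper names are invented for the example.

```python
from collections import Counter


def word_frequencies(texts, top_n=5):
    """Return the top_n most frequent whitespace-separated words."""
    counts = Counter(word for text in texts for word in text.split())
    return counts.most_common(top_n)


def text_bar_chart(pairs, width=30):
    """Render (label, count) pairs as lines of '#' scaled to the largest count."""
    largest = max(count for _, count in pairs)
    lines = []
    for label, count in pairs:
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{label:<12} {bar} {count}")
    return lines


texts = ["hotel booking hotel deals", "news world news", "hotel news"]
for line in text_bar_chart(word_frequencies(texts)):
    print(line)

# The same helper works for category counts.
for line in text_bar_chart([("Travel", 4), ("News", 2)], width=10):
    print(line)
```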
5. Model Development:
A. Data Mining Analysis:
• Split the dataset into training and testing sets for model evaluation.
• Implement various machine learning algorithms for classification, such as logistic
regression, decision trees, or random forests.
B. Training and Evaluation:
• Evaluate the performance of each model using metrics like accuracy, precision, recall,
and F1-score.
• Discuss the challenges and considerations specific to evaluating a model for website
classification.
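To make the evaluation metrics concrete, the sketch below computes accuracy plus per-class precision, recall, and F1-score from two label lists, and averages F1 across classes (macro-F1). The labels are toy values and the `evaluate` helper is hypothetical; a library implementation would normally be used in the notebook.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, macro_f1, per_class) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, macro_f1, per_class


y_true = ["News", "News", "Travel", "Travel", "Travel", "Shop"]
y_pred = ["News", "Travel", "Travel", "Travel", "Travel", "News"]
accuracy, macro_f1, per_class = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
for label, metrics in per_class.items():
    print(label, {k: round(v, 3) for k, v in metrics.items()})
```

In this toy case the rare "Shop" class is never predicted correctly, so its F1 is zero even though overall accuracy looks moderate. That gap is the kind of issue to discuss when the categories are imbalanced.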
6. Advanced Techniques:
i. Feature Engineering:
• Propose additional features that could enhance the model's performance.
How might these features capture more nuanced information about websites?
ii. Dynamic Nature of Websites:
• Given the dynamic nature of the online environment, how could the model
adapt to newly emerging website types? Discuss strategies for model
adaptation.
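As one possible direction for feature engineering, the sketch below derives a few simple features from a URL string. The feature names are hypothetical, and the subdomain count is deliberately crude (it assumes a two-label registered domain, which is wrong for suffixes such as "co.uk").

```python
from urllib.parse import urlparse


def url_features(url):
    """Derive simple numeric/categorical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    labels = host.split(".") if host else []
    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        # Crude: treats everything before the last two labels as subdomains.
        "num_subdomains": max(len(labels) - 2, 0),
        "tld": labels[-1] if labels else "",
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_query": bool(parsed.query),
    }


features = url_features("https://shop.blog.example.com/items/42?ref=home")
print(features)
```

Features like these can be appended to the text-derived features, and their contribution checked by comparing model scores with and without them.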
7. Create Dashboard, Report and Conclusions:
• Summarize the findings, including insights gained from exploratory data analysis and
the performance of the classification model.
• How interpretable is the chosen model? Can you explain the decision-making process
of the model in the context of website classification?
• Provide recommendations for further improvements or considerations in the dynamic
landscape of web classification.
• Reflect on the challenges encountered during the analysis. What potential
improvements or future work would you recommend to enhance the model's
performance?
This assignment allows students to apply knowledge of data exploration, preprocessing, data
modelling, and model building to solve a real-world problem in the business domain. It also
encourages them to explore additional concepts for improving model performance.
Deliverables
• The complete Python program (source code as an .ipynb notebook) and report must be
submitted to Blackboard.
• Python Script (Program Code):
o Name the file under your name and SUKD number.
o Start the first two lines in your program by typing your name and SUKD
number. For example:
# Nor Anis Sulaiman
#SUKD20231234
o For each question, give an ID and explain what you want to discover. For example:
a. Explore the distribution of website categories in the dataset. Are there any specific
categories that are more prevalent than others?
b. Visualize the distribution of URL lengths and user comments lengths. Are there patterns
or outliers that could be informative for the classification model?
c. What steps would you take to clean and preprocess the URLs and user comments for
effective analysis?
d. How might you handle any missing values in the dataset, and what impact could they
have on the classification model?
e. Provide descriptive statistics for key variables such as URL lengths and user comments
lengths. What insights can be derived from these statistics?
f. Explore potential additional features that could enhance the model's ability to classify
websites accurately.
g. How might the inclusion of features derived from URLs or user comments contribute
to the overall model performance?
h. Choose a classification algorithm suitable for website classification. Explain your
choice.
i. Implement the chosen algorithm using Python and relevant libraries. What
considerations should be taken into account during the model implementation phase?
j. Split the dataset into training and testing sets. How would you assess the performance
of the model using metrics like accuracy, precision, recall, and F1-score?
k. Discuss potential challenges in evaluating the model's effectiveness and generalization
to new websites.
l. Create visualizations to interpret the model's predictions and showcase its classification
performance.
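Questions h to j can be illustrated end to end with a small sketch. The brief names logistic regression, decision trees, and random forests; the example below instead uses multinomial Naive Bayes with add-one smoothing, chosen only because it fits in a few lines of standard-library Python. The data, class name, and split helper are all invented for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_split(rows, test_ratio=0.25, seed=42):
    """Split (text, label) pairs so each label keeps a similar share in both sets."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_ratio))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, rows):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter()
        for text, label in rows:
            self.label_counts[label] += 1
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


data = [
    ("cheap flights hotel booking", "Travel"),
    ("beach resort holiday hotel", "Travel"),
    ("city tours hotel deals", "Travel"),
    ("flights holiday packages", "Travel"),
    ("breaking news politics world", "News"),
    ("world news election report", "News"),
    ("politics report headlines news", "News"),
    ("headlines world breaking", "News"),
]
train, test = stratified_split(data)
model = NaiveBayes().fit(train)
correct = sum(model.predict(text) == label for text, label in test)
print(f"test accuracy: {correct}/{len(test)}")
```

With only eight toy rows the printed score means little; the point is the workflow of splitting by class, fitting on the training part, and scoring on held-out rows.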
Documents: Coursework Report
As part of the assessment, you must submit the project report in printed and softcopy form,
which should have the following format:
A) Cover Page:
All reports must be prepared with a front cover. A protective transparent plastic sheet can be
placed in front of the report to protect the front cover. The front cover should be presented with
the following details:
o Module
o Coursework Title
o Intake
o Student name and ID
o Date Assigned (the date the report was handed out).
o Date Completed (the date the report is due to be handed in).
B) Contents:
• Introduction and assumptions (if any)
• Data import / Cleaning / pre-processing / transformation
• Each question must start on a separate page and contain:
o Analysis Techniques - data exploration / manipulation / visualization
o Screenshot of source code with the explanation.
o Screenshot of output/plot with the explanation.
o Outline the findings based on the results obtained.
• The extra feature explanation must be on a separate page and contain:
o Screenshot of source code with the explanation.
o Screenshot of output/plot with the explanation.
o Explain how adding this extra feature can improve the results.
C) Conclusion
• Depth and breadth of analysis
• Quality and depth of feedback on the analysis process
• Reflection on learning and areas for improvement
D) References
• The font size used in the report must be 12pt, and the font is Times New Roman. Full
source code is not allowed to be included in the report. The report must be typed and
clearly printed.
• You may source algorithms and information from the Internet or books. Proper
referencing of the resources should be evident in the document.
• All references must be made using the APA (American Psychological Association)
referencing style as shown below:
o The theory was first propounded in 1970 (Larsen, A.E. 1971), but since then has
been refuted; M.K. Larsen (1983) is among those most energetic in their
opposition……….
o /**Following source code obtained from (Danang, S.N. 2002)*/
int noshape=2;
noshape=GetShape();
• A list of references at the end of your document or source code must be specified in the
following format:
Larsen, A.E. 1971, A Guide to the Aquatic Science Literature, McGraw-Hill, London.
Larsen, M.K. 1983, British Medical Journal [Online], Available from
http://libinfor.ume.maine.edu/acquatic.htm (Accessed 19 November 1995)
Danang, S.N., 2002, Finding Similar Images [Online], The Code Project, Available
from http://www.codeproject.com/bitmap/cbir.asp, [Accessed 14th September 2006]
Further information on other types of citation is available in Petrie, A., 2003, UWE
Library Services Study Skills: How to reference [online], England, University of