COMP3230 Principles of Operating Systems
Programming Assignment Two
Due date: Nov. 19, 2023, at 23:59
Total 11 points
Programming Exercise – Accelerate LLM Inference via Multi-Threading

Objectives
1. An assessment task related to ILO 4 [Practicability] – “demonstrate knowledge in applying system
software and tools available in the modern operating system for software development”.
2. A learning activity related to ILO 2.
3. The goals of this programming exercise are:
§ to have direct practice in designing and developing multithreading programs;
§ to learn how to use POSIX pthreads (and semaphore) libraries to create, manage, and
coordinate multiple threads in a shared memory environment;
§ to design and implement synchronization schemes for multithreaded processes using
semaphores, or mutex locks and condition variables.
Tasks
Optimize the matrix-vector-multiplication algorithm of GPT by multi-threading. Like other
neural networks, GPT and its variants rely on matrix-vector multiplication, also called the
fully-connected/linear layer in deep learning, to apply the learned parameters, and it accounts for
more than 70% of the whole calculation. Thus, to accelerate GPT and get faster responses, it is
critical to have faster matrix-vector multiplication, and multi-threading is usually considered a
powerful way to achieve this.
In this assignment, we will use an open-source variation of GPT, llama2 released by Meta, and we
provide a complete pure C implementation of its inference in seq.c as the baseline of your work,
along with the model weights. You need to use pthread.h with either semaphores or (mutex locks
+ condition variables) to implement a multi-threaded version of matrix-vector multiplication. This
multi-threaded version will significantly accelerate the inference of the Large Language Model.
Acknowledgement: This assignment is based on the open-source project llama2.c by Andrej
Karpathy; thanks to the open-source community.
GPT-based Large Language Model
At a high level, GPT is a machine that generates words one by one based on the previous words (also
known as the prompt), and Figure 1a illustrates the basic workflow of GPT generating “How are you”:
Figure 1. GPT Insight. a) GPT generates text token by token, and each output is the input of the next generation step. b) GPT has four
major components: the Tokenizer turns a word (string) into a vector, Softmax + Sample gives the next token, and each layer has
Attention and an FFN (Feed-Forward Network), consisting of many Matrix-Vector-Multiplications.
Figure 1b showcases the inference workflow for each word, like “You” in “How are you”: First, words
are transformed into tokens using a tokenizer, which is essentially a (Python) dictionary that assigns
a unique vector to each word. The embedding vectors then go through multiple layers, each consisting of
three steps.
§ The first step is attention, where the model calculates attention scores based on the cosine
similarity between the current word's query embedding and the embeddings of previous words
(keys). The attention output is a weighted average of the value embeddings, and this process
involves learnable parameters in the form of Matrix-Vector-Multiplication (linear layer).
§ The second step is a feed-forward network (FFN) that adds more learnable parameters through
Matrix-Vector-Multiplication.
§ The third step is positional embedding, which takes into account the ordering of words in natural
language by adding positional information to the attention calculations.
After going through all the layers, the embeddings are classified to generate a specific word as the
output. This involves using a softmax function to convert the embeddings into a probability
distribution, and randomly sampling a word from that distribution.
Understanding GPT is not required for this assignment. Just remember that an LLM uses a lot of
Matrix-Vector-Multiplication to apply its learned parameters, which is what makes it powerful.
Task: Matrix-Vector-Multiplication
Figure 2. Matrix-Vector-Multiplication Algorithm.
As shown in Figure 2, Matrix-Vector-Multiplication can be expressed as two nested iterations:
For Each Row i
For Column j, accumulate Matrix[i][j] * Vector[j] to Out[i]
More specifically, a sample C implementation is shown below (also in seq.c):
void mat_vec_mul(float* out, float* vec, float* mat, int col, int row) {
    for (int i = 0; i < row; i++) {
        float val = 0.0f;
        for (int j = 0; j < col; j++) {
            val += mat[i * col + j] * vec[j]; // mat[i * col + j] := mat[i][j]
        }
        out[i] = val;
    }
}
Your task in this assignment is to parallelize the outer iteration (at the 2nd line) by allocating rows to
threads. More specifically, for a matrix with d rows and n threads working on the
computation, if d is divisible by n, the k-th thread (k = 0, 1, …, n − 1) will handle the rows from
k × (d / n) to (k + 1) × (d / n) − 1. To illustrate, if we have a 6-row matrix with 2 threads, the 0th
thread will handle rows 0 to 2, while the 1st thread will handle rows 3 to 5. If d is not divisible by n,
we can assign each of the first n − 1 threads (k = 0, 1, …, n − 2) ⌈d / n⌉ rows, while the last thread handles the
remaining rows. More explanation of this design can be found in Appendix a. Parallelism Checking.
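For instance, the row range of each thread could be computed as in the following sketch (the names thr_count and k are illustrative, not part of the starter code):
int chunk = (row + thr_count - 1) / thr_count; // ceil(row / thr_count) rows per thread
int start = k * chunk;                         // first row of the k-th thread
int end = start + chunk;                       // one past the last row
if (end > row) end = row;                      // the last thread takes the remaining rows
for (int i = start; i < end; i++) {
    // accumulate out[i] as in mat_vec_mul above
}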
Moreover, in order to reduce overhead, you are required to create one set of threads and reuse
them across all mat_vec_mul() calls, instead of creating new threads for each
mat_vec_mul() call. One popular approach based on synchronization is illustrated in Figure 3.
Figure 3. Reference Synchronization Workflow, consisting of 3 functions: a) CREATE_MAT_VEC_MUL function: creates n
threads; each thread falls asleep immediately; b) MAT_VEC_MUL function: assigns new parameters, wakes up the threads to
work on the parameters, and waits until the threads finish before returning; c) DESTROY_MAT_VEC_MUL function: wakes up
the threads to collect their system usage and exit, then waits for the threads to exit and collects their usage.
More specifically, the synchronization workflow illustrated in Figure 3 consists of 3 functions and the
thread function:
1. create_mat_vec_mul(int thr_count): to be called at the beginning of the program, shall:
a. Create n threads
b. Let the threads identify themselves, i.e., each thread knows it is the i-th thread
c. Let the created threads fall asleep immediately
2. void mat_vec_mul(float* out, float* vec, float* mat, int col, int row):
the API exposed to do Matrix-Vector-Multiplication, shall:
a. Assign the new parameters (out, vec, mat, col, row) to the threads
b. Wake up the threads to do the calculation
c. Have the main thread wait until all threads have finished the task, and then return
3. destroy_mat_vec_mul(): to be called at the end of the program, shall:
a. Wake up the threads to collect their own system usage and terminate
b. Wait until all threads exit, and collect the system usage of the threads
c. Collect the system usage of the main thread, and display the usage of each thread and of the main thread
d. Release all resources related to multi-threading, and return
4. void* thr_func(void* arg): the thread function that does the Matrix-Vector-Multiplication, shall:
a. Fall asleep immediately after initialization
b. Be able to be woken up by the main thread to work on assigned tasks
c. Inform the main thread after finishing a task
d. Be able to collect its own system usage and terminate
More details and the reasons behind this design can be found in Appendix b. Design of Context.
Other synchronization workflows certainly exist, and we are open to your ideas; however, due to the
large class size, we can only accept submissions following the design above. A minimal sketch of this
workflow is shown below.
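The sketch uses one mutex lock and two condition variables; all names (task_t, generation, go, ready, etc.) are illustrative assumptions rather than part of the starter code, and your submission must still fit the exact function signatures between the YOUR CODE markers:
#include <pthread.h>
#include <stdlib.h>

typedef struct { float *out, *vec, *mat; int col, row; } task_t;

static pthread_t *thr;                 // the reused worker threads
static int n_thr;                      // = thr_count
static int generation = 0;             // bumped once per mat_vec_mul call
static int done = 0;                   // workers finished in this generation
static int terminating = 0;            // set by destroy_mat_vec_mul
static task_t task;                    // parameters shared with the workers
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t go = PTHREAD_COND_INITIALIZER;    // main -> workers
static pthread_cond_t ready = PTHREAD_COND_INITIALIZER; // workers -> main

static void *thr_func(void *arg) {
    int id = (int)(long)arg, seen = 0; // each thread knows it is the id-th thread
    for (;;) {
        pthread_mutex_lock(&lock);
        while (generation == seen && !terminating)  // fall asleep until woken
            pthread_cond_wait(&go, &lock);
        if (terminating) { pthread_mutex_unlock(&lock); break; }
        seen = generation;
        task_t t = task;                            // copy the assigned parameters
        pthread_mutex_unlock(&lock);
        int chunk = (t.row + n_thr - 1) / n_thr;    // row partitioning as above
        int start = id * chunk;
        int end = start + chunk > t.row ? t.row : start + chunk;
        for (int i = start; i < end; i++) {
            float val = 0.0f;
            for (int j = 0; j < t.col; j++)
                val += t.mat[i * t.col + j] * t.vec[j];
            t.out[i] = val;
        }
        pthread_mutex_lock(&lock);
        if (++done == n_thr)                        // last finisher informs main
            pthread_cond_signal(&ready);
        pthread_mutex_unlock(&lock);
    }
    // collect this thread's own usage here (e.g., via getrusage) before exiting
    return NULL;
}

void create_mat_vec_mul(int thr_count) {
    n_thr = thr_count;
    thr = malloc(sizeof(pthread_t) * n_thr);
    for (long i = 0; i < n_thr; i++)
        pthread_create(&thr[i], NULL, thr_func, (void *)i);
}

void mat_vec_mul(float *out, float *vec, float *mat, int col, int row) {
    pthread_mutex_lock(&lock);
    task = (task_t){ out, vec, mat, col, row };     // assign the new parameters
    done = 0;
    generation++;
    pthread_cond_broadcast(&go);                    // wake up all workers
    while (done < n_thr)                            // wait until all threads finish
        pthread_cond_wait(&ready, &lock);
    pthread_mutex_unlock(&lock);
}

void destroy_mat_vec_mul(void) {
    pthread_mutex_lock(&lock);
    terminating = 1;
    pthread_cond_broadcast(&go);                    // wake workers so they can exit
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < n_thr; i++)
        pthread_join(thr[i], NULL);                 // wait for all threads to exit
    free(thr);
}
An equivalent design built on semaphores (a per-thread “go” semaphore plus a shared “done” semaphore) is also acceptable under the rules above.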
Specifications
a. Preparing Environment
Download start_code.zip from the course's Moodle, including the sequential version seq.c along
with utility functions in utilities.c and utilities.h. Compile seq.c with gcc:
gcc -o seq seq.c utilities.c -O2 -lm
Please include utilities.c, use the -lm flag to link the math library, and use the -O2 flag to apply level-2 optimization.
Please stick to -O2 and do not use other optimizations, for fairness. You do not need to understand,
and are not allowed to modify, utilities.c and utilities.h.
Download the model files. Two files are required: model.bin for the model weights and
tokenizer.bin for the tokenizer. Please use the following commands to download them:
wget -O model.bin https://huggingface.co/huangs0/llama2.c/resolve/main/model.bin
wget -O tokenizer.bin https://huggingface.co/huangs0/llama2.c/resolve/main/tokenizer.bin
Run the compiled program, giving an integer as the random seed for sampling:
./seq <seed>
Upon invocation, the program configures the random seed and begins sentence generation
starting from a special <START> token. The program calls the transformer function to generate the
next token, and uses printf with fflush to print each generated word to the shell immediately. A pair of
calls to the utility time-measurement function time_in_ms measures the elapsed time with millisecond accuracy:
long start = time_in_ms();       // measure time in ms accuracy
int next, token = 1, pos = 0;    // token = 1 -> <START>
while (pos < config.seq_len) {   // do not exceed max length
    next = transformer(token, pos, &config, &state, &weights); // generate next token
    printf("%s", vocab[next]); fflush(stdout); // force print
    token = next; pos++;         // record token and shift position
}
long end = time_in_ms();         // measure time in ms accuracy
This program will start generating tiny stories. When generation finishes, the length of the
generated text, the total time, the average speed, and the system usage will be printed, such as:
One day, a little girl named Lucy
......
Carrying a brightly stepped for one dog ladybuging once she had
length: 256, time: 4.400000 s, achieved tok/s: 58.181818
main thread - user: 4.3881 s, system: 0.0599 s
By fixing the same machine (workbench2) and the same random seed, the generated text can be
exactly replicated. For example, the sample above was produced on workbench2 with random seed
42. Moreover, achieved tok/s represents the average number of tokens generated per
second, and we use it as the metric for speed measurement. Due to fluctuating system load, the
generation speed will fluctuate around some level.
b. Implement the parallel Matrix-Vector-Multiplication by multi-threading
Open llama2_[UID].c, replacing [UID] with your UID, and implement the workflow
illustrated in Figure 3 by completing the four functions and adding appropriate global variables. For
synchronization, please use either semaphores or (mutex locks and condition variables). You may
only modify code between // YOUR CODE STARTS HERE at line 43 and // YOUR
CODE ENDS HERE at line 67 in llama2_[UID].c.
Here are some suggestions for the implementation:
1. How to assign new tasks and inform the threads to terminate? Note that all threads can access global
variables, so you can update the global variables and wake the threads up.
2. The main thread shall wait for the threads to work or terminate.
3. For collecting system usage, please consider getrusage; a sketch follows this list.
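For example, a thread might report its own usage as below (a sketch; RUSAGE_THREAD is a Linux-specific extension that reports the usage of the calling thread only, while RUSAGE_SELF reports the whole process):
#define _GNU_SOURCE   /* must precede the includes to expose RUSAGE_THREAD */
#include <stdio.h>
#include <sys/resource.h>

struct rusage usage;
getrusage(RUSAGE_THREAD, &usage);  /* or RUSAGE_SELF for the whole process */
printf("user: %ld.%04ld s, system: %ld.%04ld s\n",
       usage.ru_utime.tv_sec, usage.ru_utime.tv_usec / 100,
       usage.ru_stime.tv_sec, usage.ru_stime.tv_usec / 100);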
Your implementation shall compile with the following command:
gcc -o llama2_[UID] llama2_[UID].c utilities.c -O2 -pthread -lm
Then run the compiled program. It now accepts two arguments, seed and thr_count. The code
for reading the arguments has been provided in llama2_[UID].c. Use thr_count
to specify the number of threads:
./llama2_[UID] <seed> <thr_count>
If your implementation is correct, then under the same random seed the generated text shall be the same
as the sequential version's, but the generation will be faster. Moreover, you shall report the system
usage of each thread separately. For example, this is the output for random seed 42 on
workbench2 with 4 threads:
One day, a little girl named Lucy
......
Carrying a brightly stepped for one dog ladybuging once she had
length: 256, time: 2.100000 s, achieved tok/s: 121.****62
Thread 0 has completed - user: 1.2769 s, system: 0.0363 s
Thread 1 has completed - user: 1.2658 s, system: 0.0361 s
Thread 2 has completed - user: 1.2749 s, system: 0.0277 s
Thread 3 has completed - user: 1.2663 s, system: 0.0**3 s
main thread - user: 5.7126 s, system: 0.**9 s
c. Measure the performance and report your findings
Benchmark your implementation (tok/s) on your own computer with different numbers of threads and
report the metrics in a table like the following:
Thread Numbers       Speed (tok/s)   User Time   System Time   User Time / System Time
0 (Sequential)
1 (1 child thread)
2
4
6
8
10
12
16
Regarding system usage (user time / system time), please report the usage of the whole process
rather than of each thread. Then, based on the table above, briefly analyze the relation between
performance and the number of threads, and explain the relationship. Submit the table, your analysis,
and your reasoning in a one-page PDF document.
IMPORTANT: Due to the large number of students this year, please conduct the benchmark on your
own computer instead of on the workbench2 server. Grading of your report is based on your analysis
and reasoning, not on the speed you achieved. When working on workbench2, please be
reminded that you have a limited maximum number of threads (128) and processes (512), so
please do not conduct benchmarking on the workbench2 server.
Submission
Submit your program to the Programming # 2 submission page on the course's Moodle website.
Name the program llama2_[UID].c (replace [UID] with your HKU student number). As the
Moodle site may not accept source code submissions, you may compress the files to zip format before
uploading. Submission checklist:
§ Your source code llama2_[UID].c, which must be self-contained (no dependencies other
than utilities.c and utilities.h)
§ Your report, including the benchmark table, your analysis, and your reasoning
§ Please do not submit the model and tokenizer binary files (model.bin and tokenizer.bin).
Documentation
1. At the head of the submitted source code, i.e., llama2_[UID].c, state the:
§ File name
§ Student's name and UID
§ Development platform
§ Remark – describe how much you have completed (see Grading Criteria)
2. Inline comments (try to be detailed, so that your code can easily be understood by others)
Computer Platform to Use
For this assignment, you can develop and test your program on any Linux/Mac platform, but you
must make sure that the program can correctly execute on the workbench2 Linux server (as the
tutors will use this platform to do the grading). Your program must be written in C and successfully
compiled with gcc on the server.
Please note that the only server for COMP3230 is workbench2.cs.hku.hk; please do not use
any other CS department server, especially academy11 and academy21, as they are reserved for
other courses. In case you cannot log in to workbench2, please contact the tutor(s) for help.
Grading Criteria
1. Your submission will be primarily tested on the workbench2 server. Make sure that your
program can be compiled without any errors. Otherwise, we have no way to test your
submission and you will get a zero mark.
2. As the tutor will read your source code, please write your program with good readability (i.e.,
with good coding conventions and sufficient comments) so that you do not lose marks due to
confusion.
3. You may only use pthread.h and semaphore.h (if needed); using other external libraries such as
OpenMP or LAPACK will lead to a zero mark.
Detailed Grading Criteria
§ Documentation: −1 point if not done
§ Include necessary documentation to explain the logic of the program
§ Include the required student info at the beginning of the program
§ Report: 1 point
§ Measure the performance of the sequential program and your parallel program on your
computer with various numbers of threads (0, 1, 2, 4, 6, 8, 10, 12, 16).
§ Briefly analyze the relation between performance and the number of threads, and explain the relation
§ Implementation: 10 points, evaluated progressively:
1. (+2 points = 2 points) Achieve a correct result using multi-threading. Correct means the
generated text of the multi-threaded and sequential versions are identical under the same random seed.
2. (+3 points = 5 points total) All of 1., and achieve >10% acceleration over the sequential
version with 4 threads. Acceleration is measured in tok/s, and the
acceleration must result from multi-threading rather than anything else, such as compiler optimization (-O3).
3. (+5 points = 10 points total) All of 2., and reuse threads across calls. Reusing threads
means the number of threads created in the whole program must remain constant at thr_count.
Plagiarism
Plagiarism is a very serious offense. Students should understand what constitutes plagiarism, the
consequences of committing an offense of plagiarism, and how to avoid it. Please note that we may
ask you to explain how your program functions, and we may also use
software tools to detect software plagiarism.
Note: You must clearly acknowledge it if you use ChatGPT or any AI tools to generate code in your
implementation. Please quote GPT-generated code with // GPT Code Start Here and
// GPT Code End Here.
Appendix
a. Parallelism Checking
To parallelize with multi-threading, it is critical to verify that the computations are independent, so as
to avoid race conditions and the potential need for locks. More specifically, we need to pay special
attention to checking for, and avoiding, writes to the same memory location while preserving correctness.
For example, the 1st iteration (the outer for-loop) meets the requirement of independence, as the
computation of each row does not affect the others, and the only two writes are to out[i] and val. Writes to
the same out[i] can be avoided by partitioning i between threads, and val can be implemented as a stack
variable in each thread, so no two threads write to the same memory.
Quite the opposite, the 2nd iteration (the inner for-loop) is not a good candidate for multi-threading, even though
the only write is to val. If val is implemented as a stack variable, then each thread only holds a part of the
correct answer. If val is implemented as a heap variable shared among threads, then val
requires a lock to avoid racing writes.
b. Design of Context
A straightforward solution to the above problem is to let the thread function do the computation and exit
when finished, and to let the original mat_vec_mul function create the threads and wait for them to exit via
pthread_join, as sketched below. This provides the same synchronization.
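A minimal sketch of this naive approach, under the same row partitioning as before (all names are illustrative, not part of the starter code):
#include <pthread.h>

typedef struct { float *out, *vec, *mat; int col, row, id, n; } arg_t;

static void *worker(void *p) {
    arg_t *a = p;
    int chunk = (a->row + a->n - 1) / a->n;       // rows per thread
    int start = a->id * chunk;
    int end = start + chunk > a->row ? a->row : start + chunk;
    for (int i = start; i < end; i++) {
        float val = 0.0f;
        for (int j = 0; j < a->col; j++)
            val += a->mat[i * a->col + j] * a->vec[j];
        a->out[i] = val;
    }
    return NULL;
}

void mat_vec_mul_naive(float *out, float *vec, float *mat,
                       int col, int row, int n) {
    pthread_t tid[n];
    arg_t args[n];
    for (int k = 0; k < n; k++) {                 // n new threads on every call
        args[k] = (arg_t){ out, vec, mat, col, row, k, n };
        pthread_create(&tid[k], NULL, worker, &args[k]);
    }
    for (int k = 0; k < n; k++)
        pthread_join(tid[k], NULL);               // wait for all threads to exit
}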
However, this implementation is problematic because each call to mat_vec_mul creates
n new threads. Unfortunately, to generate a sentence, an LLM like llama2 calls mat_vec_mul
thousands of times, so thousands of threads would be created and destroyed, which imposes
significant overhead on the operating system.
Note that all the calls to mat_vec_mul are doing the same task, i.e., Matrix-Vector-Multiplication,
and the only difference between the calls is the parameters. Thus, a straightforward
optimization is to reuse the threads. At a high level, we create n threads in advance, and when
mat_vec_mul is called, we assign the new parameters to the thread functions and let the threads
work on the new parameters.
Moreover, it is worth noting that mat_vec_mul is only valid within the context, i.e., between
create_mat_vec_mul and destroy_mat_vec_mul; outside of it there are no threads other than the main one (they are not
yet created or have been destroyed). This kind of context provides efficient and robust control over
local state, and has been integrated into high-level languages, such as Python's `with`.
    欧美日韩成人网| 99re在线视频上| 日韩精品一区二区三区久久| 日本午夜精品一区二区| 青青在线免费视频| 风间由美久久久| 国产精品日韩av| 日本精品免费| 黄色影院一级片| 成人国产亚洲精品a区天堂华泰| 久久国产精品-国产精品| 国产99久久精品一区二区永久免费| 日韩不卡一二区| 99精品一级欧美片免费播放| 国产精品极品尤物在线观看| 日韩精品一区二区三区四 | 国产高清精品一区二区| 亚洲在线播放电影| av在线播放亚洲| 亚洲日本欧美在线| 成人av蜜桃| 国产999在线观看| 国产欧美日韩中文| 欧美激情视频在线观看| 国产午夜精品一区| 国产夫妻自拍一区| 日韩av影视| 国产v片免费观看| 日本精品免费观看| 在线视频不卡一区二区| 99久久激情视频| 久久久久久久久影视| 久久精品国产精品亚洲| 人妻无码久久一区二区三区免费| 国产mv久久久| 国产精品美乳一区二区免费| 欧美在线欧美在线| 久久精品美女视频网站| 国内偷自视频区视频综合| 久久亚洲欧美日韩精品专区| 精品无人区一区二区三区竹菊| 国产精品区一区二区三在线播放| 精品一区二区三区毛片| 九九九热精品免费视频观看网站| 高清视频一区| 婷婷久久青草热一区二区| 久久久人人爽| 日韩欧美在线免费观看视频| 色视频www在线播放国产成人| 欧美性视频在线| 不卡av电影在线观看| 国产中文日韩欧美| 最新中文字幕久久| 国产极品美女高潮无套久久久| 日本一区视频在线| 国产精品欧美一区二区三区奶水| 国产视频不卡| 亚洲成色www久久网站| 久久er99热精品一区二区三区| 欧美少妇一级片| 中文字幕日韩精品一区二区 | 久久爱av电影| 欧美二区在线| 久久69精品久久久久久久电影好| 97国产精品视频| 日韩国产精品一区二区| 国产精品三区www17con| 国产精品自产拍高潮在线观看| 欧美一区二区三区四区夜夜大片 | 日韩av高清在线播放| 精品久久久91| 国产伦精品一区二区三区免| 日本精品中文字幕| 九九精品在线播放| 久久精品国产99精品国产亚洲性色| 蜜桃av噜噜一区二区三区| 亚洲国产精品毛片| 国产爆乳无码一区二区麻豆| 精品少妇人妻av免费久久洗澡| 亚洲乱码一区二区三区三上悠亚 | 久久夜色精品国产| 国产高清不卡av| 国产一区二区在线观看免费播放 | 精品国产免费久久久久久尖叫 | 久久亚洲春色中文字幕| 久无码久无码av无码| 狠狠久久综合婷婷不卡| 日韩在线综合网| 国产精品自拍偷拍视频| 91av福利视频| 日本在线视频不卡| 国产裸体写真av一区二区| 色999日韩自偷自拍美女| 久久精品99国产精品酒店日本| 99视频精品免费| 蜜桃麻豆www久久国产精品| 日本精品久久久| 亚洲色欲久久久综合网东京热| 国产精品无码一本二本三本色| 99精品视频播放| 国产日韩欧美精品在线观看| 欧美精品99久久| 日韩日韩日韩日韩日韩| 亚洲精品视频一区二区三区| 欧美成人第一页| 久久精品福利视频| 久久久久久亚洲精品| 国产精品com| 国产免费xxx| 免费看黄在线看| 欧美日本韩国一区二区三区| 日产国产精品精品a∨| 亚洲高清乱码| 亚洲永久免费观看| 中文字幕av导航| 蜜臀久久99精品久久久久久宅男| 国产成人啪精品视频免费网| 国产成人精品久久亚洲高清不卡| 国产美女精品久久久| 国产综合18久久久久久| 欧美 国产 精品| 欧美牲交a欧美牲交| 欧美日韩大片一区二区三区| 欧美一级片一区| 日本午夜在线亚洲.国产| 日韩av成人在线| 视频在线一区二区三区| 天堂精品一区二区三区| 视频一区二区在线观看| 午夜精品久久久久久久99热| 日韩av一区二区三区在线观看| 日本一道本久久| 欧美又大又粗又长| 狠狠干 狠狠操| 国产日韩一区二区在线观看| 高清视频在线观看一区| 91久久综合亚洲鲁鲁五月天| 97成人在线免费视频| 99久久国产免费免费| 国产极品精品在线观看| 久久riav| 国产精品久久久久久久久久三级 | 免费av一区二区| 久久久久久成人| 亚州精品天堂中文字幕| 少妇大叫太大太粗太爽了a片小说| 日本三级中国三级99人妇网站| 日韩精品手机在线观看| 日本wwwcom| 韩国三级日本三级少妇99| 国产一区二区三区奇米久涩| 国产精品一区二区不卡视频| 91高清免费视频| 日韩在线中文字| 国产精品久久久久免费a∨大胸| 精品久久久久久一区二区里番 | 国产www精品| 国产精品视频免费在线观看| 国产精品九九久久久久久久| 久久夜色撩人精品| 中文字幕精品一区日韩| 手机看片福利永久国产日韩| 欧美一区深夜视频| 国产熟女高潮视频| 久久亚洲一区二区| 国产精品日韩av| 亚洲伊人婷婷| 欧美日韩在线播放一区二区| 国产日韩欧美大片| 久久噜噜噜精品国产亚洲综合| 久久久av一区| 中文字幕中文字幕在线中心一区| 色999五月色| 国内精品久久影院| 91精品国产91久久久久久| 国产成人一区二区三区别| 国产精品视频永久免费播放| 在线观看亚洲视频啊啊啊啊| 日本精品视频在线观看| 国产亚洲天堂网| 久久久久久久久久久99| 欧美xxxx14xxxxx性爽| 日本韩国在线不卡| 国产精品一码二码三码在线| 九九九久久久| 一本二本三本亚洲码| 欧美精品自拍视频| 97国产精品人人爽人人做| 国产精品区免费视频| 天天综合色天天综合色hd| 欧美日韩亚洲一区二区三区四区 | 免费一级特黄特色毛片久久看| 91精品久久久久久久久久| 国产精品麻豆va在线播放| 日韩av电影免费在线| 国产麻花豆剧传媒精品mv在线| 日韩中文字幕网址| 天天人人精品| 国产欧美精品aaaaaa片| 国产不卡一区二区在线观看|