(pandas) Building a YouTube channel dataset (DataFrame) for analysis


꼰대코더 2023. 11. 18. 19:39

Prerequisites

YouTube API key (free)	To obtain one, see the middle section of this guide: https://blog.hubspot.com/website/how-to-get-youtube-api-key
Module installation	pip3 install pandas google-api-python-client
Channel ID to analyze	Open the channel's YouTube page, view the page source, and search for channelId. Example: 노빠꾸탁재훈, https://www.youtube.com/@nobacktak, channelId: UCSSkHIU1-nL_FeCjeZ_Xtvg
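
The manual lookup in the last row (view the page source, search for channelId) can also be scripted. The following is a minimal sketch, assuming the page source embeds the ID as a JSON-style "channelId":"UC..." pair. The helper name extract_channel_id and the sample fragment are hypothetical, and YouTube's page markup may change at any time.

```python
import re

# Assumption: the ID appears in the page source as "channelId":"UC...",
# made up of letters, digits, hyphens, and underscores.
CHANNEL_ID_PATTERN = re.compile(r'"channelId"\s*:\s*"(UC[0-9A-Za-z_-]+)"')


def extract_channel_id(page_source):
    """Return the first channelId found in the page source, or None."""
    match = CHANNEL_ID_PATTERN.search(page_source)
    return match.group(1) if match else None


# Hypothetical fragment of a saved page source, for illustration only.
sample_source = '<script>{"channelId":"UCSSkHIU1-nL_FeCjeZ_Xtvg","title":"example"}</script>'
print(extract_channel_id(sample_source))
```

In practice the input would be the text saved from the browser's "view page source" window for the channel page.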

Python Code

from googleapiclient.discovery import build
import pandas as pd

API_KEY = "YOUR_YOUTUBE_API_KEY"  # replace with your own YouTube API key
# Build the YouTube Data API v3 client
youtube = build('youtube', 'v3', developerKey=API_KEY)

# ID of the channel to analyze (노빠꾸탁재훈)
CHANNEL_ID = "UCSSkHIU1-nL_FeCjeZ_Xtvg"

# Fetch the channel's statistics. The response also contains the uploads
# playlist ID, which is needed later to list the channel's videos.
def get_channel_stats(youtube, channel_id):
    request = youtube.channels().list(
        part="snippet,contentDetails,statistics",
        id=channel_id
    )
    response = request.execute()

    return response['items']
   
# Collect the ID of every video in the uploads playlist.
# Request 50 items per page (the maximum for maxResults) to keep the number of calls low.
def get_video_list(youtube, upload_id):
    video_list = []
    request = youtube.playlistItems().list(
        part="snippet,contentDetails",
        playlistId=upload_id,
        maxResults=50
    )
    next_page = True
    while next_page:
        response = request.execute()
        data = response['items']

        for video in data:
            video_id = video['contentDetails']['videoId']
            if video_id not in video_list:
                video_list.append(video_id)

        # Is there another page left?
        if 'nextPageToken' in response:
            next_page = True
            request = youtube.playlistItems().list(
                part="snippet,contentDetails",
                playlistId=upload_id,
                pageToken=response['nextPageToken'],
                maxResults=50
            )
        else:
            next_page = False

    return video_list
  
# Fetch the properties of each video.
# Request details for 50 videos per call to keep the number of calls low.
# Each video becomes a Python dictionary so that pandas can load the list later.
def get_video_details(youtube, video_list):
    stats_list = []

    for i in range(0, len(video_list), 50):
        request = youtube.videos().list(
            part="snippet,contentDetails,statistics",
            id=",".join(video_list[i:i+50])  # comma-separated video IDs
        )

        data = request.execute()
        for video in data['items']:
            title = video['snippet']['title']
            published = video['snippet']['publishedAt']
            description = video['snippet']['description']
            # Some videos have no tags, so check first to avoid a KeyError
            if 'tags' in video['snippet']:
                tag_count = len(video['snippet']['tags'])
            else:
                tag_count = 0
            view_count = video['statistics'].get('viewCount', 0)
            like_count = video['statistics'].get('likeCount', 0)
            # dislikeCount has been private since December 2021 and is only
            # returned to the video owner, so with an API key this falls back to 0
            dislike_count = video['statistics'].get('dislikeCount', 0)
            comment_count = video['statistics'].get('commentCount', 0)
            stats_dict = dict(title=title, description=description, published=published,
                              tag_count=tag_count, view_count=view_count, like_count=like_count,
                              dislike_count=dislike_count, comment_count=comment_count)
            stats_list.append(stats_dict)

    return stats_list
    
 
# Channel information
channel_stats = get_channel_stats(youtube, CHANNEL_ID)
# Uploads playlist ID: every video the channel has uploaded is linked to this ID
upload_id = channel_stats[0]['contentDetails']['relatedPlaylists']['uploads']
# List of video IDs
video_list = get_video_list(youtube, upload_id)
# Properties of each video (list of dictionaries)
video_data = get_video_details(youtube, video_list)

# Convert to a pandas DataFrame
df = pd.DataFrame(video_data)

# Convert the count columns from strings (object dtype) to integers
df = df.astype({'view_count':'int', 'like_count':'int', 'dislike_count':'int', 'comment_count':'int'})
# Alternatively, convert one column at a time
# df['view_count'] = df['view_count'].astype('int')
# Or use pd.to_numeric
# df["view_count"] = pd.to_numeric(df["view_count"])

# Create a new column (reactions = sum of the like, dislike, and comment counts)
df["reactions"] = df["like_count"] + df["dislike_count"] + df["comment_count"]

# Save as CSV
df.to_csv("노빠꾸탁재훈.csv")
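
The astype conversion in the script fails if a count value is not numeric, and the published column stays a plain string. Below is a minimal sketch of a more tolerant conversion, using pd.to_numeric with errors="coerce" and pd.to_datetime on a small hand-made DataFrame. The rows are invented for illustration and are not real channel data; the dislike column is left out for brevity.

```python
import pandas as pd

# Invented sample rows shaped like the dictionaries built by get_video_details.
rows = [
    {"title": "video A", "published": "2023-11-01T09:00:00Z",
     "view_count": "1200", "like_count": "30", "comment_count": "4"},
    {"title": "video B", "published": "2023-11-08T09:00:00Z",
     "view_count": "800", "like_count": "n/a", "comment_count": "1"},
]
df = pd.DataFrame(rows)

count_columns = ["view_count", "like_count", "comment_count"]
for column in count_columns:
    # Unparseable values become NaN, are replaced by 0, then cast to int.
    df[column] = pd.to_numeric(df[column], errors="coerce").fillna(0).astype("int64")

# Parse the ISO 8601 timestamp strings into timezone-aware datetimes.
df["published"] = pd.to_datetime(df["published"], utc=True)

df["reactions"] = df["like_count"] + df["comment_count"]
print(df.dtypes)
print(df[["title", "reactions"]])
```

Replacing bad values with 0 is a design choice made here for simplicity; keeping them as NaN may be preferable when missing counts should not be read as zero.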

Download: 노빠꾸탁재훈.csv
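
The paging logic of get_video_list can be tried without an API key or quota. The following is a minimal sketch that mirrors the nextPageToken loop against a hypothetical fetch_page callable. The names collect_video_ids, fetch_page, fake_fetch_page, and FAKE_PAGES are made up for illustration; fetch_page stands in for the playlistItems().list(...).execute() call.

```python
def collect_video_ids(fetch_page):
    """Follow nextPageToken to the last page and return unique video IDs in order."""
    video_ids = []
    seen = set()
    page_token = None  # None means "first page"
    while True:
        response = fetch_page(page_token)
        for item in response["items"]:
            video_id = item["contentDetails"]["videoId"]
            if video_id not in seen:
                seen.add(video_id)
                video_ids.append(video_id)
        # The last page carries no nextPageToken.
        page_token = response.get("nextPageToken")
        if page_token is None:
            break
    return video_ids


# Fake two-page playlist (hypothetical data shaped like a playlistItems response).
FAKE_PAGES = {
    None: {"items": [{"contentDetails": {"videoId": "a"}},
                     {"contentDetails": {"videoId": "b"}}],
           "nextPageToken": "page2"},
    "page2": {"items": [{"contentDetails": {"videoId": "b"}},
                        {"contentDetails": {"videoId": "c"}}]},
}


def fake_fetch_page(page_token):
    return FAKE_PAGES[page_token]


print(collect_video_ids(fake_fetch_page))
```

Tracking seen IDs in a set keeps the duplicate check cheap on channels with many uploads, while the list preserves the playlist order.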
