Posts in the 'python' category

32 posts in 'python'

2021.04.18 Series and DataFrame
2021.04.18 Installing and removing conda packages
2021.04.17 When pip is slow
2021.04.17 How to set up a conda virtual environment in PyCharm
2020.01.29 Conversion, sorting, packing and unpacking: simple examples
2020.01.22 Coroutine example
2020.01.20 Class variables and instance variables
2020.01.16 Type conversion practice
2020.01.02 Setting up a Python virtual environment
2019.12.27 Crawling practice 11. Saving data to Google Sheets

Series and DataFrame

pandas must be installed in order to use Series and DataFrame.

If you use conda, activate the virtual environment and then install the module.

>conda activate virtual_env

>conda install python=3.8 pandas //note: the numpy module depends on the Python version

First, import the module.

from pandas import Series, DataFrame

A Series can be thought of as a list that has an index.
Ex)
mine = Series([1,2,3],index=['tech','bio','enter'])
print(mine)

tech     1
bio      2
enter    3
dtype: int64

A DataFrame is a data structure made up of several Series.
Its basic characteristics are:
1, It can be created from a dict

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low' :  [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}

2, Passing a list to columns specifies the column order
Ex1)

daeshin_day = DataFrame(daeshin,
                        columns=['close','high','low','open'])

   close   high    low   open
0  11900  12100  11600  11650
1  11600  11800  11050  11100
2  11000  11200  10900  11200
3  11100  11100  10950  11100
4  11050  11150  10900  11000

3, Passing a list to index replaces the default index with the labels you want.
Ex2)
daeshin_day = DataFrame(daeshin,
                        columns=['close','high','low','open'],
                        index=['01-01','01-02','01-03','01-04','01-05'])

       close   high    low   open
01-01  11900  12100  11600  11650
01-02  11600  11800  11050  11100
01-03  11000  11200  10900  11200
01-04  11100  11100  10950  11100
01-05  11050  11150  10900  11000

Each column (a Series) is referenced in the same way as a dict entry.

print(daeshin_day['open'])

01-01    11650
01-02    11100
01-03    11200
01-04    11100
01-05    11000
Name: open, dtype: int64

To read the values of a row, use the loc indexer.

print(daeshin_day.loc['01-01'])

close    11900
high     12100
low      11600
open     11650
Name: 01-01, dtype: int64

print(type(daeshin_day.loc['01-01']))  # the type is Series

Conclusion:

To read a row, use loc.

To read a column, index with the column name, as with a dict.
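
A minimal sketch of the two access patterns in the conclusion, assuming pandas is installed. It reuses the post's daeshin data cut down to two days; the names column, row and cell are introduced here only for illustration.

```python
from pandas import DataFrame, Series

# Shortened version of the post's price data (first two days only).
daeshin = {'open':  [11650, 11100],
           'high':  [12100, 11800],
           'low':   [11600, 11050],
           'close': [11900, 11600]}
daeshin_day = DataFrame(daeshin,
                        columns=['close', 'high', 'low', 'open'],
                        index=['01-01', '01-02'])

# Column access: index with the column name, as with a dict.
column = daeshin_day['open']

# Row access: use loc with the index label.
row = daeshin_day.loc['01-01']

# loc with a row label and a column label returns a single value.
cell = daeshin_day.loc['01-02', 'close']

# Both the column and the row come back as Series objects.
print(type(column).__name__, type(row).__name__)
print(column.max(), row['high'], cell)
```

Passing both labels to loc avoids chaining two lookups when only one value is needed.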

Note: how to read the row labels (index) and the column labels.

print(daeshin_day.index)
print(daeshin_day.columns)

Index(['01-01', '01-02', '01-03', '01-04', '01-05'], dtype='object')
Index(['close', 'high', 'low', 'open'], dtype='object')

Printing the DataFrame using the index

for row in daeshin_day.index:
    print(daeshin_day.loc[row])

Posted by easy16

Installing and removing conda packages

ex) Assuming the virtual env is named test and the package to install is pandas:

>conda activate test

//install

>conda install pandas

//remove

>conda uninstall pandas

//list the installed packages

>conda list

When pip is slow

This can be resolved by switching to a mirror server.

Create a pip.ini file at the path below with the following contents, then try again.

C:\Users\<user name>\AppData\Roaming\pip\pip.ini

[list]
format=columns

[global]
index-url=http://ftp.daumkakao.com/pypi/simple
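
As a sketch, the same file can be generated with Python's standard configparser module. The example writes into a temporary directory rather than the real AppData\Roaming\pip folder, so the target path is an assumption for demonstration only; the helper name write_pip_ini is made up here, while the section names and values are the ones shown above.

```python
import configparser
import tempfile
from pathlib import Path


def write_pip_ini(path):
    """Write the pip settings from this post to the given file path."""
    config = configparser.ConfigParser()
    config['list'] = {'format': 'columns'}
    config['global'] = {'index-url': 'http://ftp.daumkakao.com/pypi/simple'}
    with open(path, 'w') as f:
        config.write(f)


# Demonstration only: write into a temporary directory, then read it back.
demo_dir = Path(tempfile.mkdtemp())
ini_path = demo_dir / 'pip.ini'
write_pip_ini(ini_path)

loaded = configparser.ConfigParser()
loaded.read(ini_path)
print(loaded['global']['index-url'])
```

To apply it for real, the path passed to write_pip_ini would be the pip.ini location given above.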

Source: blog.naver.com/PostView.nhn?blogId=kangsho15&logNo=221901004296&categoryNo=0&parentCategoryNo=0&viewDate=&currentPage=1&postListTopCurrentPage=1&from=postView

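How to set up a conda virtual environment in PyCharm

#1, Create a conda virtual environment

conda create -n <environment name> python=<version> <libraries to install>
ex)
>conda create -n virtual_env python=3.7 numpy matplotlib

Tip) By default the virtual environment is installed under the following path:
"c:\users\<user account>\anaconda3\envs\virtual_env"

#2, Check that the virtual environment was created
>conda env list

Tip) Activating and deactivating the virtual environment at the prompt (can be skipped)
>conda activate virtual_env
//the prompt changes
(virtual_env) >
//deactivate
>conda deactivate

#3, Finally, set the interpreter in the PyCharm project settings to the virtual environment you created, and you are done.

Source: sdc-james.gitbook.io/onebook/2./2.1./2.1.1./2-conda-virtual-environments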

Conversion, sorting, packing and unpacking: simple examples

#Base conversion example
print(bin(16)[2:],oct(16)[2:],hex(16)[2:])

#Sorting: sorted() returns a new list, x itself is unchanged
x = [5,6,7,1,2,3]
print(sorted(x,reverse=True))
print(x)
print(list(reversed(x)))

print(max(x),min(x),sum(x))

print(list(zip(['1','2','3'],['a','b','c'])))

#map with a lambda
a=[1,2,3,1,4,5]
b=['a','b','c']
c=['d','e','f']

print(list(map(lambda x : x**2 , a)))

#the loop variable is named v so that it does not overwrite the list b
for i,v in enumerate(reversed(range(10))):
    print(i,v)

#packing unpacking : pay close attention to the shape
x=[(1,2),(3,4), (5,6)]
for i in x:
    print(i)

for i,j in x:
    print(i,j)


x=[(1,2,(10,20)),(3,4,(30,40)), (5,6,(50,60))]

for i,j,(k,l) in x:
    print(i,j,k,l)
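
Two related idioms, shown as a small sketch: a starred name packs the leftover items into a list, and zip(*pairs) unpacks a list of tuples back into separate sequences. The variable names are illustrative.

```python
# Starred assignment: 'rest' packs everything after the first item.
first, *rest = [5, 6, 7, 1]
print(first, rest)

# zip(*...) reverses zip(): the pairs are unpacked into positional arguments.
pairs = list(zip(['1', '2', '3'], ['a', 'b', 'c']))
numbers, letters = zip(*pairs)
print(numbers, letters)

# Nested targets must mirror the shape of each item.
x = [(1, 2, (10, 20)), (3, 4, (30, 40))]
flattened = [(i, j, k, l) for i, j, (k, l) in x]
print(flattened)
```

If the target shape does not match an item (for example three names for a two-item tuple), Python raises ValueError.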

Coroutine example

def num_coroutine():
    total = 0
    try:
        while True:
            print('before yield')
            # hand total to the caller and wait for a value from send()
            x = (yield total)
            total += x
            print('after yield')
            print(x)
    except GeneratorExit:
        # raised inside the coroutine when close() is called
        print('close coroutine')

co = num_coroutine()
print('total',next(co))
print('total', co.send(int(input('put int:'))))
print('total', co.send(int(input('put int:'))))
print('total', co.send(int(input('put int:'))))

co.close()

Source: https://dojang.io/mod/page/view.php?id=2420
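
For a version that runs without keyboard input, here is a sketch of the same accumulate-by-send pattern driven by fixed values. The function name running_total and the sample inputs are assumptions made for this illustration.

```python
def running_total():
    """Coroutine that yields the sum of every value sent so far."""
    total = 0
    try:
        while True:
            # Pause here: hand 'total' to the caller, wait for send().
            x = yield total
            total += x
    except GeneratorExit:
        # Raised inside the coroutine when close() is called.
        print('close coroutine')


co = running_total()
initial = next(co)  # run up to the first yield
totals = [co.send(n) for n in (10, 20, 30)]
co.close()
print(initial, totals)
```

The first next() call is required because send() with a value is only allowed once the coroutine is paused at a yield.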

Class variables and instance variables

Reference: http://schoolofweb.net/blog/posts/%ED%8C%8C%EC%9D%B4%EC%8D%AC-oop-part-3-%ED%81%B4%EB%9E%98%EC%8A%A4-%EB%B3%80%EC%88%98class-variable/
(SchoolofWeb :: Python - OOP Part 3. Class Variable)

class A(object):
    a = 1.0


a1 = A()
a2 = A()
#a1.a = 2.0 # uses an instance variable
A.a = 2.0 # uses the class variable
print(a1.a, a2.a)

class A(object):
    a = 1.0


a1 = A()
a2 = A()
a1.b = 2.0 # adds the instance variable b to a1 only
A.a = 2.0
print(a1.a, a2.a)
print('a1 : ', a1.__dict__)
print('a2 : ', a2.__dict__)
print(A.__dict__)
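
The commented-out line a1.a = 2.0 in the first snippet is the interesting case. A sketch of what it does: assigning through an instance creates an instance variable that shadows the class variable for that instance only. The values 5.0 and 2.0 and the names shadowed, instance_dicts and restored are chosen here for illustration.

```python
class A:
    a = 1.0  # class variable, reached through the class


a1 = A()
a2 = A()

a1.a = 5.0  # creates an instance variable on a1 only; A.a is untouched
A.a = 2.0   # rebinding the class variable is seen by a2, not by a1

shadowed = (a1.a, a2.a)
instance_dicts = (dict(a1.__dict__), dict(a2.__dict__))
print(shadowed, instance_dicts)

del a1.a    # removing the instance variable exposes the class variable again
restored = a1.a
print(restored)
```

Attribute lookup checks the instance's __dict__ first and falls back to the class, which is why a2 follows A.a while a1 does not.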

Type conversion practice


#Type conversion practice

x = 10 
y = '10'


print(chr(65))
print(chr(65+25), ord('Z'))
print(chr(97))

print(ord('가'))

print(bin(16))
#bin() returns a str, so slicing can return just the digits
print(bin(16)[2:])
#Application
print(bin(16)[2:].replace('1','#').replace('0','!'))

print(oct(16))
print(hex(16))

print(hex(id(y)))
print(type(hex(id(y))))


x = 0b1101
y = 0o15
z = 0xd

if x == y :
    print('{} is same with {}'.format(x,y))

if x == z :
    print('{} is same with {}'.format(x,z))

    

#same as False
print(bool([]))
print(bool({}))
print(bool(()))
print(bool(0))
print(bool(0.0))
print(bool(''))

#Any other value is True

print(bool(1))
print(bool(-1))
print(bool(' '))
print(bool('a'))


a= [True,False,False] 
b= [True,True,True] 
c= [False,False,False] 
#all: returns True only when every value is True
print(all(a))
print(all(b))
print(all(c))

#any: returns True when at least one value is True

print(any(a))
print(any(b))
print(any(c))

d = ['', 0, 0.0]
print(all(d))
print(any(d))

"""
A
Z 90
a
44032
0b10000
10000
#!!!!
0o20
0x10
0x226a521f170
<class 'str'>
13 is same with 13
13 is same with 13
False
False
False
False
False
False
True
True
True
True
False
True
False
True
True
False
False
False
"""

#list, tuple, set ,dict


name = 'jayce'
print(name)
print(tuple(name))
print(set(name))
#print(dict(name)) is not possible (raises ValueError)


#Counting digits
#Count how many times the digit 8 appears in the numbers 0 to 9999
print(str(list(range(10000))).count('8'))


#dict

a = dict(one=1,two=2,three=3)
b = {'one':1,'two':2,'three':3}
#How to combine keys and values held in lists into a dict
c = dict(zip(['one','two','three'], [1,2,3]))
d = dict([('two',2), ('one',1), ('three', 3)]) 
e = dict({'one':1,'two':2,'three':3})


#Application
new_dict=dict(zip(a.keys(),a.values()))
print(new_dict)
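
The base conversions earlier in this post all go from int to str. A sketch of the reverse direction: int() accepts a base argument, and format() produces the digits without the 0b/0o/0x prefix, so slicing is not needed. The variable names are illustrative.

```python
n = 16

# int -> str without a prefix, equivalent to bin(n)[2:], oct(n)[2:], hex(n)[2:]
binary = format(n, 'b')
octal = format(n, 'o')
hexa = format(n, 'x')
print(binary, octal, hexa)

# str -> int: pass the base explicitly; a prefixed string also parses
# when the prefix matches the base.
back = (int(binary, 2), int(octal, 8), int(hexa, 16), int('0x10', 16))
print(back)

# chr() and ord() are inverses of each other.
round_trip = chr(ord('가'))
print(round_trip)
```

Passing a string whose digits are not valid for the given base, such as int('12', 2), raises ValueError.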

Setting up a Python virtual environment

#Set up the virtual environment
$pip3 install virtualenv
$virtualenv venv

root@goorm:/workspace/pyContainer/instaclone# ls
venv

#Activate the virtual environment, then install Django and create the project
$source venv/bin/activate
$pip3 list
$pip3 install django==2.1
$django-admin startproject config .

$python manage.py migrate

(venv) root@goorm:# python manage.py runserver 0:80
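
To confirm from inside Python that the interpreter is running in the activated environment, one common check compares sys.prefix with sys.base_prefix. This is a sketch: the helper name in_virtualenv is made up here, and the real_prefix fallback is an assumption meant to cover older virtualenv releases.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Assumption: older virtualenv releases set sys.real_prefix instead.
    if hasattr(sys, 'real_prefix'):
        return True
    # In a virtual environment, prefix points at the environment directory
    # while base_prefix points at the Python installation it was created from.
    return sys.prefix != getattr(sys, 'base_prefix', sys.prefix)


print(in_virtualenv(), sys.prefix)
```

Run with the venv activated, the printed prefix should point into the venv directory created above.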

Crawling practice 11. Saving data to Google Sheets

#tips
#Tag attributes can be retrieved in dict form.
#A link obtained by crawling can itself be crawled to fetch more detailed data.
#Note: select_one returns only the single element matching the selector, not a list.


import requests
from bs4 import BeautifulSoup
import openpyxl



excel_file = openpyxl.Workbook()
#Remove the default sheet and create a new one
#excel_file.remove(excel_file['Sheet'])
excel_file.remove(excel_file.active)
excel_sheet = excel_file.create_sheet('best 100 items in fashion')

excel_sheet.column_dimensions['A'].width = 10
excel_sheet.column_dimensions['B'].width = 100
excel_sheet.column_dimensions['C'].width = 80
excel_sheet.column_dimensions['D'].width = 80
excel_sheet.column_dimensions['E'].width = 80
excel_sheet.column_dimensions['F'].width = 80


excel_sheet.append(['rank','name','company','ceo','tel','link'])
#Center-align the header cells (A1 through F1)
for header_cell in excel_sheet[1]:
    header_cell.alignment = openpyxl.styles.Alignment(horizontal='center')

site = 'http://corners.gmarket.co.kr/Bestsellers?viewType=G&groupCode=G06'
res = requests.get(site)

soup = BeautifulSoup(res.content, 'html.parser')
##gBestWrap > div > div:nth-child(6) > div:nth-child(3) > ul > li:nth-child(1) > a
#selector = '#gBestWrap > div > div:nth-child(6) > div > ul > li > a'
selector = 'div.best-list > ul > li'
data= soup.select(selector)


for item in data:
    
    rank = item.find('p').get_text()
    name = item.find('a','itemname').get_text()
    #Tag attributes can be accessed like a dict!
    link = item.find('a','itemname')['href']
    if rank != 'PLUS':
        
        #print(rank,name,link)        
        res_sub = requests.get(link)
        soup_sub = BeautifulSoup(res_sub.content, 'html.parser')
        selector_sub = 'dl.exchange_data.seller_data > dd'
        data_sub = soup_sub.select(selector_sub)
        #Note: select_one returns only the single element matching the selector, not a list.
        #data_sub = soup_sub.select_one(selector_sub)
      
        vendor_name = data_sub[0].get_text()
        ceo_name = data_sub[1].get_text()
        phone_number = data_sub[2].get_text()
        
        print(vendor_name,ceo_name,phone_number)
        
        excel_sheet.append([rank,name,vendor_name,ceo_name, phone_number,link])
    
excel_file.save('best100report.xlsx')
excel_file.close()
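
The tips at the top of this post (dict-style attribute access, select versus select_one) can be tried without any network access. The sketch below assumes BeautifulSoup is installed and uses a made-up HTML snippet that only mimics the shape of a ranked list; the example.com links and item names are placeholders.

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics a ranked product list.
html = """
<div class="best-list"><ul>
  <li><p>1</p><a class="itemname" href="http://example.com/a">item A</a></li>
  <li><p>2</p><a class="itemname" href="http://example.com/b">item B</a></li>
</ul></div>
"""
soup = BeautifulSoup(html, 'html.parser')

# select() returns a list of every match.
items = soup.select('div.best-list > ul > li')

# select_one() returns only the first match (or None if nothing matches).
first_link = soup.select_one('a.itemname')
missing = soup.select_one('table')

# Tag attributes are read like dict entries; attrs holds the full dict.
href = first_link['href']
names = [li.find('a', 'itemname').get_text() for li in items]
print(len(items), href, first_link.attrs, names, missing)
```

Because select_one can return None, checking the result before indexing into it avoids a TypeError on pages whose layout differs from what the selector expects.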

For Google Sheets

https://console.developers.google.com/

First obtain access to the Google Sheets API and the Google Drive API,

then download the credentials as a JSON file. (The code loads this file.)

The downloaded key file contains the line below.

That account must be given editor permission in Google Drive before it can be used.

"client_email": "googlesheeteditor@first-trial-177512.iam.gserviceaccount.com",

#How to use Google Sheets

import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('first-trial-5176235391ae.json',scope)
client = gspread.authorize(creds)


sheet = client.open('report').sheet1
#data = sheet.get_all_records()
#print(data)

string = "I'm inserting a new row into a spreadsheet using python"
row = string.split()
index = 3 
sheet.insert_row(row, index)

The crawled data above can be saved to Google Sheets as follows.


import requests
from bs4 import BeautifulSoup
import gspread
from oauth2client.service_account import ServiceAccountCredentials


#Open the Google Sheet
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('first-trial-5176235391ae.json',scope)
client = gspread.authorize(creds)


worksheet = client.open('report').sheet1
worksheet.clear()
worksheet.insert_row(['rank','name','company','ceo','tel','link'],1)


#Fetch the crawled data
site = 'http://corners.gmarket.co.kr/Bestsellers?viewType=G&groupCode=G06'
res = requests.get(site)

soup = BeautifulSoup(res.content, 'html.parser')
##gBestWrap > div > div:nth-child(6) > div:nth-child(3) > ul > li:nth-child(1) > a
#selector = '#gBestWrap > div > div:nth-child(6) > div > ul > li > a'
selector = 'div.best-list > ul > li'
data= soup.select(selector)

row=1
for item in data:
    
    rank = item.find('p').get_text()
    name = item.find('a','itemname').get_text()
    #Tag attributes can be accessed like a dict!
    link = item.find('a','itemname')['href']
    if rank != 'PLUS':
        
        #print(rank,name,link)        
        res_sub = requests.get(link)
        soup_sub = BeautifulSoup(res_sub.content, 'html.parser')
        selector_sub = 'dl.exchange_data.seller_data > dd'
        data_sub = soup_sub.select(selector_sub)
        #Note: select_one returns only the single element matching the selector, not a list.
        #data_sub = soup_sub.select_one(selector_sub)
      
        vendor_name = data_sub[0].get_text()
        ceo_name = data_sub[1].get_text()
        phone_number = data_sub[2].get_text()
        row += 1
        print(vendor_name,ceo_name,phone_number)
        worksheet.insert_row([rank,name, vendor_name,ceo_name,phone_number,link],row)
