《Using Databases with Python》Week3 Data Models and Relational SQL 课堂笔记

Coursera课程《Using Databases with Python》密歇根大学

Week3 Data Models and Relational SQL

15.4 Designing a Data Model

主要介绍了数据模型的重要性，以及数据模型构建的一些思考过程。

15.5 Representing a Data Model in Tables

概念模型

主键(Primary key)，指的是一个列或多列的组合，其值能唯一地标识表中的每一行，通过它可强制表的实体完整性。主键主要是用于其他表的外键关联，以及本记录的修改与删除。

外键(Foreign key)，作用是保持数据一致性，完整性，主要目的是控制存储在外键表中的数据。使两张表形成关联，外键只能引用外表中的列的值。

如果我们要构建上面概念模型所表示的数据库，那么我们用到的一些SQL语句有：

CREATE TABLE Genre(

      id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

      name TEXT

)

CREATE TABLE Album(

      id     INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

      artist_id INTEGER

      title     TEXT

)

CREATE TABLE Track(

       id     INTEGER NOT NULL PRIMARY KEY

              AUTOINCREMENT UNIQUE,

       title TEXT,

       album_id INTEGER,

       genre_id INTEGER,

       len INTEGER,

       rating INTEGER,

       count INTEGER

)

15.6 Inserting Relational Data

插入数据

insert into Artist(name) values ('Led Zepplin')

insert into Artist(name) values ('AC/DC')

像上面这样就往Artist表中加入了两行数据。

而对于Album表来说，它连接了Artist表，有两列数据要插入，那么这样。

insert into Album(title,artist_id) values ('Who Made Who',2)

insert into Album(title,artist_id) values ('IV',1)

所以这样之后，我们就建立起了数据之间的关系。

15.7 Reconstructing Data with JOIN

JOIN操作像是在几个表之间的SELECT操作。

而我们告诉JOIN怎么使用这些key则需要用到ON语句。有点像WHERE语句。

select Album.title, Artist.name from Album join Artist on Album.artist_id = Artist.id

如果把事情变复杂一些……

Work Example: Tracks.py

import xml.etree.ElementTree as ET

import sqlite3

conn = sqlite3.connect('trackdb.sqlite')

cur = conn.cursor()

# Make some fresh tables using executescript()

cur.executescript('''

DROP TABLE IF EXISTS Artist;

DROP TABLE IF EXISTS Album;

DROP TABLE IF EXISTS Track;

CREATE TABLE Artist (

    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

    name    TEXT UNIQUE

);

CREATE TABLE Album (

    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

    artist_id  INTEGER,

    title   TEXT UNIQUE

);

CREATE TABLE Track (

    id  INTEGER NOT NULL PRIMARY KEY

        AUTOINCREMENT UNIQUE,

    title TEXT  UNIQUE,

    album_id  INTEGER,

    len INTEGER, rating INTEGER, count INTEGER

);

''')

fname = input('Enter file name: ')

if ( len(fname) < 1 ) : fname = 'Library.xml'

# <key>Track ID</key><integer>369</integer>

# <key>Name</key><string>Another One Bites The Dust</string>

# <key>Artist</key><string>Queen</string>

def lookup(d, key):

    found = False

    for child in d:

        if found : return child.text

        if child.tag == 'key' and child.text == key :

            found = True

    return None

stuff = ET.parse(fname)

all = stuff.findall('dict/dict/dict')

print('Dict count:', len(all))

for entry in all:

    if ( lookup(entry, 'Track ID') is None ) : continue

    name = lookup(entry, 'Name')

    artist = lookup(entry, 'Artist')

    album = lookup(entry, 'Album')

    count = lookup(entry, 'Play Count')

    rating = lookup(entry, 'Rating')

    length = lookup(entry, 'Total Time')

    if name is None or artist is None or album is None :

        continue

    print(name, artist, album, count, rating, length)

    cur.execute('''INSERT OR IGNORE INTO Artist (name)

        VALUES ( ? )''', ( artist, ) )

    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist, ))

    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)

        VALUES ( ?, ? )''', ( album, artist_id ) )

    cur.execute('SELECT id FROM Album WHERE title = ? ', (album, ))

    album_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Track

        (title, album_id, len, rating, count)

        VALUES ( ?, ?, ?, ?, ? )''',

        ( name, album_id, length, rating, count ) )

    conn.commit()

使用python脚本建立数据库的过程，注意其中的关键字IGNORE，它的作用是如果当期数据存在，那就不插入，否则插入。在这个地方十分有用，因为索引不能随意变化。

作业代码

import xml.etree.ElementTree as ET

import sqlite3

conn = sqlite3.connect('trackdb.sqlite')

cur = conn.cursor()

# Make some fresh tables using executescript()

cur.executescript('''

DROP TABLE IF EXISTS Artist;

DROP TABLE IF EXISTS Album;

DROP TABLE IF EXISTS Track;

DROP TABLE IF EXISTS Genre;

CREATE TABLE Artist (

    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

    name    TEXT UNIQUE

);

CREATE TABLE Genre (

    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

    name    TEXT UNIQUE

);

CREATE TABLE Album (

    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,

    artist_id  INTEGER,

    title   TEXT UNIQUE

);

CREATE TABLE Track (

    id  INTEGER NOT NULL PRIMARY KEY

        AUTOINCREMENT UNIQUE,

    title TEXT  UNIQUE,

    album_id  INTEGER,

    genre_id  INTEGER,

    len INTEGER, rating INTEGER, count INTEGER

);

''')

fname = input('Enter file name: ')

if ( len(fname) < 1 ) : fname = 'Library.xml'

# <key>Track ID</key><integer>369</integer>

# <key>Name</key><string>Another One Bites The Dust</string>

# <key>Artist</key><string>Queen</string>

def lookup(d, key):

    found = False

    for child in d:

        if found : return child.text

        if child.tag == 'key' and child.text == key :

            found = True

    return None

stuff = ET.parse(fname)

all = stuff.findall('dict/dict/dict')

print('Dict count:', len(all))

for entry in all:

    if ( lookup(entry, 'Track ID') is None ) : continue

    name = lookup(entry, 'Name')

    artist = lookup(entry, 'Artist')

    album = lookup(entry, 'Album')

    genre = lookup(entry,'Genre')

    count = lookup(entry, 'Play Count')

    rating = lookup(entry, 'Rating')

    length = lookup(entry, 'Total Time')

    if name is None or artist is None or album is None or genre is None:

        continue

    print(name, artist, album, genre, count, rating, length)

    cur.execute('''INSERT OR IGNORE INTO Artist (name)

        VALUES ( ? )''', ( artist, ) )

    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist, ))

    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)

        VALUES ( ?, ? )''', ( album, artist_id ) )

    cur.execute('SELECT id FROM Album WHERE title = ? ', (album, ))

    album_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Genre (name)

        VALUES ( ? )''', ( genre, ) )

    cur.execute('SELECT id FROM Genre WHERE name = ? ', (genre, ))

    genre_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Track

        (title, album_id, genre_id, len, rating, count)

        VALUES ( ?, ?, ?, ?, ? ,?)''',

        ( name, album_id, genre_id, length, rating, count ) )

conn.commit()