SQL 找出栏位值重覆的记录及删除重复记录
使用 SQL 的 GROUP BY 找出资料表中栏位值重覆的记录,将资料表按照分组,然后计算每个分组的记录数。
set myConnection = Server.CreateObject("ADODB.Connection")
Provider="Microsoft.Jet.OLEDB.4.0;Data Source=" & Server.MapPath("db/Northwind.mdb")
Store_Information 资料表内容
栏位排列 ORDER BY 以 S_Index 的顺序产生资料内容。
SELECT * FROM Store_Information ORDER BY S_Index
Do while NOT rs.EOF
Response.Write rs("S_Index") & rs("S_Name") & rs("S_Qty") & rs("S_Date")
rs.MoveNext
loop
索引键S_Index | 仓库名称S_Name | 库存量S_Qty | 单据日期S_Date |
---|---|---|---|
1 | Los Angeles | 900 | 2020/03/05 |
2 | San Diego | 1500 | 2020/03/07 |
3 | Los Angeles | 300 | 2020/03/12 |
4 | Boston | 1200 | 2020/03/08 |
5 | Boston | 1200 | 2020/03/08 |
6 | New Jersey | 700 | 2020/03/18 |
7 | Taiwan | 850 | 2020/09/08 |
8 | Boston | 200 | 2020/03/10 |
DISTINCT 找出所有不同的资料值 (无重复记录)
在 SELECT 查询语句中使用 DISTINCT 关键字过滤重复出现的记录值,栏位内找出所有不同的记录。
SELECT DISTINCT Store_name FROM Store_Information
S_Name |
---|
Boston |
Los Angeles |
New Jersey |
San Diego |
Taiwan |
找出栏位值重覆的记录
GROUP BY 计算出相符的资料值
某情况下 ACCESS 资料表的部份栏位可能会有相同的值,匯总数据之聚合函数与分组 GROUP BY 叙述句搭配聚合函数 Aggregate Function 使用,将查询结果中特定栏位值相同的资料分为若干个群组,而每一个群组都会传回一个资料列。若没有使用 GROUP BY 聚合函数针对一个 SELECT 查询,只会返回一个汇总值。
1 找出仓库名称及日期均相同的记录(两个栏位均相同)
GROUP BY 计算出相同的 S_Name 及 S_Date 资料记录,条件为仓库名称及日期均相同。
并且由变数 SelectIndex 保留列出结果的索引键。
SELECT MAX(S_Index), S_Name, S_Date
FROM Store_Information
GROUP BY S_Name, S_Date HAVING (COUNT(*) > 0)
Dim SelectIndex
Do while NOT rs.EOF
Response.Write rs(0) & rs("S_Name")
SelectIndex = SelectIndex & rs(0) & ","
rs.MoveNext
loop
S_Index rs(0) | S_Name |
---|---|
5 | Boston |
8 | Boston |
1 | Los Angeles |
3 | Los Angeles |
6 | New Jersey |
2 | San Diego |
7 | Taiwan |
由列出的记录中得到索引键 5, 8, 1, 3, 6, 2, 7,
IN 搭配 WHERE 子句可以用来限定必需符合某些栏位值为条件,来搜寻资料表中的特定(已知)资料记录。
相反的 NOT IN 就是不包含在这范围里的的资料记录。
SELECT * FROM Store_Information
WHERE S_Index NOT IN ( & SelectIndex & ) ORDER BY S_Index
Do while NOT rs.EOF
Response.Write rs("S_Index") & rs("S_Name")
rs.MoveNext
loop
使用 NOT IN 相反的列出不包含在这范围里的。只有索引键 4 的 Boston 栏位值重覆的记录。
S_Index | S_Name |
---|---|
4 | Boston |
2 找出仅仓库名称相同的记录
GROUP BY 计算出相同的资料值 S_Name 为仓库名称相同。
SELECT MAX(S_Index), S_Name
FROM Store_Information
GROUP BY S_Name HAVING (COUNT(*) > 0)
Dim SelectIndex
Do while NOT rs.EOF
Response.Write rs(0) & rs("S_Name")
SelectIndex = SelectIndex & rs(0) & ","
rs.MoveNext
loop
S_Index rs(0) | S_Name |
---|---|
8 | Boston |
3 | Los Angeles |
6 | New Jersey |
2 | San Diego |
7 | Taiwan |
列出的记录中得到索引键 8, 3, 6, 2, 7,
SELECT * FROM Store_Information
WHERE S_Index NOT IN ( & SelectIndex & ) ORDER BY S_Index
Do while NOT rs.EOF
Response.Write rs("S_Index") & rs("S_Name")
rs.MoveNext
loop
使用 NOT IN 相反的列出结果,不包含在这范围里的有三笔,仓库名称相同栏位值重覆的记录。
S_Index | S_Name |
---|---|
1 | Los Angeles |
4 | Boston |
5 | Boston |
删除重复记录
同样使用 NOT IN 删除结果。
DELETE FROM Store_Information
WHERE S_Index NOT IN ( & SelectIndex & )
Aggregate Function 聚合函数
Aggregate Function 聚合函数,指的也就是 AVG()、COUNT()、MAX()、MIN()、SUM() 等这些内建函数。
SUM
统计仓库名称相同的库存量
SELECT S_Name, SUM(S_Qty)
FROM Store_Information GROUP BY S_Name
S_Name | S_Qty rs(1) |
---|---|
Boston | 2600 |
Los Angeles | 1200 |
New Jersey | 700 |
San Diego | 1500 |
Taiwan | 850 |
加入 WHERE 条件,统计库存量小于 1000 的仓库记录。
SELECT S_Name, SUM(S_Qty)
FROM Store_Information
WHERE S_Qty < 1000 GROUP BY S_Name
S_Name | S_Index rs(0) |
---|---|
Boston | 200 |
Los Angeles | 1200 |
New Jersey | 700 |
Taiwan | 850 |
统计仓库名称相同的库存量,且条件库存量需大于 1000 的仓库记录。
HAVING 子句是用来取代 WHERE 搭配聚合函数 Aggregate Function 进行条件查询,因为 WHERE 不能与聚合函数一起使用。
SQL 提供 HAVING 的指令,筛选一或多个栏位名称,可以用这个指令来条件筛选。HAVING 子句通常是在 SQL 句子的最后。
SELECT S_Name, SUM(S_Qty)
FROM Store_Information
GROUP BY S_Name HAVING (SUM(S_Qty) > 1000)
S_Name | S_Qty rs(1) |
---|---|
Boston | 2600 |
Los Angeles | 1200 |
San Diego | 1500 |
统计仓库名称相同的库存量,且条件库存量需小于 1000 的仓库记录。
SELECT S_Name, SUM(S_Qty)
FROM Store_Information
GROUP BY S_Name HAVING (SUM(S_Qty) < 1000)
S_Name | S_Qty rs(1) |
---|---|
New Jersey | 700 |
Taiwan | 850 |
AVG 平均值
统计仓库名称相同的库存量平均值 Average、Aggregate。
SELECT S_Name, AVG(S_Qty)
FROM Store_Information GROUP BY S_Name
S_Name | S_Qty rs(1) |
---|---|
Boston | 866.666666666667 |
Los Angeles | 600 |
New Jersey | 700 |
San Diego | 1500 |
Taiwan | 850 |
例如 Boston = 1200 + 1200 + 200 = 2600 / 3 = 866.666666666667
SQL BETWEEN 包含在二个值之间的记录资料
列出库存量在于 500 与 1000 之间的记录。
SELECT * FROM Store_Information
WHERE S_Qty BETWEEN 500 AND 1000
S_Index | S_Name | S_Qty rs(1) | S_Date |
---|---|---|---|
1 | Los Angeles | 900 | 2020/03/05 |
6 | New Jersey | 700 | 2020/03/18 |
7 | Taiwan | 850 | 2020/09/08 |
列出仓库名称在于 'M' 与 'Z' 之间的记录。
SELECT * FROM Store_Information
WHERE S_Qty BETWEEN 'M' AND 'Z'
S_Index | S_Name | S_Qty rs(1) | S_Date |
---|---|---|---|
2 | San Diego | 1500 | 2020/03/07 |
6 | New Jersey | 700 | 2020/03/18 |
7 | Taiwan | 850 | 2020/09/08 |
NOT BWTWEEN
BETWEEN 相反的 NOT BWTWEEN 就是不包含在这范围里的的资料。
SELECT * FROM Store_Information
WHERE S_Qty NOT BETWEEN 'M' AND 'Z'
S_Index | S_Name | S_Qty rs(1) | S_Date |
---|---|---|---|
8 | Boston | 200 | 2020/03/10 |
1 | Los Angeles | 900 | 2020/03/05 |
3 | Los Angeles | 300 | 2020/03/12 |
4 | Boston | 1200 | 2020/03/08 |
5 | Boston | 1200 | 2020/03/08 |