henniu

拒绝重复采集办法

  1. ssh登陆服务器

  2. 执行命令:

        su - postgres

        如果出现输入密码窗口, 输入postgres

        psql -U postgres -d 你的数据库名

        create unique index unique_index_articleno_chaptername ON t_chapter(articleno,chaptername);
        create unique index unique_index_articlename_author ON t_article(articlename,author);


如果安装过程是默认的, 你的数据库名就是yidu, 如果改了, 就到web目录下进入WEB-INF/classes, 在jdbc.properties中找到数据库名    jdbc.url=jdbc:postgresql://127.0.0.1:5432/你的数据库名


添加两个唯一索引之前必须确保数据库中不存在重复章节、重复小说


以下两条sql分别查询重复章节和重复小说


--查询重复章节
select articleno,chapterno from t_chapter where chapterno in (
    select max(chapterno) from t_chapter tc inner join (
        select articleno ,chaptername from t_chapter
        group by articleno,chaptername having count(1)>1
    ) tc1 on tc.chaptername = tc1.chaptername and tc.articleno = tc1.articleno
);

--查询重复小说
select articleno,articlename from t_article where articleno in (
    select max(articleno) from t_article tc inner join (
        select articlename from t_article
        group by articlename having count(1)>1
    ) tc1 on tc.articlename = tc1.articlename
);


查询结果正确的话, 将select xxxx from 修改为 delete from 重新执行即可。


#1楼
发帖时间:2014-10-22   |   查看数:0   |   回复数:5
永久禁区

删除重复章节
delete from t_chapter where chapterno in (
   select max(chapterno) from t_chapter tc inner join (
       select articleno ,chaptername from t_chapter
       group by articleno,chaptername having count(1)>1
   ) tc1 on tc.chaptername = tc1.chaptername and tc.articleno = tc1.articleno
);
删除重复小说
delete from t_article where articleno in (
   select max(articleno) from t_article tc inner join (
       select articlename from t_article
       group by articlename having count(1)>1
   ) tc1 on tc.articlename = tc1.articlename
);

删除指定小说全部章节

delete from t_chapter where articleno = 数字id;

2014-10-24 #2楼
胡思乱想
我能说完全看不懂么
2014-11-8 #3楼
弱水三千

--查询重复章节

select * from (

    select tc.articleno,tc.chapterno,row_number() over(partition by tc.chaptername order by tc.chapterno asc) mark

    from t_chapter tc inner join (

        select articleno ,chaptername from t_chapter

        group by articleno,chaptername having count(1)>1

    ) tc1 on tc.chaptername = tc1.chaptername and tc.articleno = tc1.articleno

) temp where mark<>1;

--查询重复小说

select * from (

    select tc.articleno,tc.articlename,row_number() over(partition by tc.articlename order by articleno asc) mark 

    from t_article tc inner join (

           select articlename from t_article

           group by articlename having count(1)>1

      ) tc1 on tc.articlename = tc1.articlename

) temp where mark<>1;


2014-12-28 #4楼
一梦若逝
如果要按章节的名称删除重复的相同名称的章节,怎么删?
2015-7-22 #5楼
Nodream
给大家发布一个非常重要的查询命令,一键批量删除重复章节的,代码如下,本人亲测可用:
delete  from t_chapter where chapterno in (select chapterno from (
   select tc.articleno,tc.chapterno,row_number() over(partition by tc.chaptername order by tc.chapterno asc) mark
   from t_chapter tc inner join (
       select articleno ,chaptername from t_chapter
       group by articleno,chaptername having count(1)>1
   ) tc1 on tc.chaptername = tc1.chaptername and tc.articleno = tc1.articleno
) temp where mark<>1) ;
2018-12-4 #6楼
游客组