Skip site navigation (1) Skip section navigation (2)

Peripheral Links

Header And Logo

PostgreSQL
| The world's most advanced open source database.

Site Navigation

Search archives
  Advanced Search

Bug in count(*)/Group By.


  • From: Nick Lawes <nick(at)grok(dot)demon(dot)co(dot)uk>
  • Subject: Bug in count(*)/Group By.
  • Date: Sun, 11 Jan 1998 19:26:17 +0000 (GMT)

Your name		:	Nick Lawes
Your email address	:	nick(at)grok(dot)demon(dot)co(dot)uk


System Configuration
- ---------------------
  Architecture (example: Intel Pentium)  	:   Intel Pentium Pro 233

  Operating System (example: Linux 2.0.26 ELF) 	:   Linux 2.0.33 ELF

  PostgreSQL version (example: PostgreSQL-6.2.1):   PostgreSQL-6.2.1 p6

  Compiler used (example:  gcc 2.7.2)		:   gcc 2.7.2.1


Please enter a FULL description of your problem:
- ------------------------------------------------
I am building a record database to track singles, albums and CD's. Having
put in about 13000 tracks I decided to count the number of tracks recorded
by each artist... The command I used was

	SELECT count(*),artist
	FROM tracks
	GROUP BY artist

I expected this to work, but found duplicate entries for many of the
artists. For example, artists beginning with V...


count|artist                                                     
- -----+-----------------------------------------------------------
	....
    2|VALENS, RICKY                                              
   12|VALLI, FRANKIE                                             
    2|VALLI, FRANKIE AND THE FOUR SEASONS                        
    2|VAN, ILA                                                   
    1|VANDROSS, LUTHER                                           
    2|VANGELIS                                                   
    2|VANITY FARE                                                
*   1|VANWARMER, RANDY                                           
*   1|VANWARMER, RANDY                                           
    5|VAPOURS                                                    
    6|VAUGHN, FRANKIE                                            
*   1|VAUGHN, SARAH AND BILLY ECKSTINE                           
*   1|VAUGHN, SARAH AND BILLY ECKSTINE                           
   14|VEE, BOBBY                                                 
    2|VEGA, TATA                                                 
    2|VENTURES                                                   
    4|VIBRATORS                                                  
*   1|VIENNA PHILHARMONIC ORCHESTRA                              
*   1|VIENNA PHILHARMONIC ORCHESTRA                              
    8|VILLAGE PEOPLE                                             
    2|VINCENT, GENE                                              
    6|VINTON, BOBBY                                              
    2|VIOLINSKI                                                  
    6|VISAGE                                                     
    6|VOYAGE                                                     
    4|VOYAGER                                                    
    2|VRETHAMMAR, SYLVIA                                         
	....
(3041 rows)

If I repeat the command, limiting the selection to just the artists
beginning with V:

	SELECT count(*),artist
	FROM tracks
	WHERE artist LIKE 'V%'
	GROUP BY artist

I get the correct answer

count|artist                             
- -----+-----------------------------------
    2|VALENS, RICKY                      
   12|VALLI, FRANKIE                     
    2|VALLI, FRANKIE AND THE FOUR SEASONS
    2|VAN, ILA                           
    1|VANDROSS, LUTHER                   
    2|VANGELIS                           
    2|VANITY FARE                        
    2|VANWARMER, RANDY                   
    5|VAPOURS                            
    6|VAUGHN, FRANKIE                    
    2|VAUGHN, SARAH AND BILLY ECKSTINE   
   14|VEE, BOBBY                         
    2|VEGA, TATA                         
    2|VENTURES                           
    4|VIBRATORS                          
    2|VIENNA PHILHARMONIC ORCHESTRA      
    8|VILLAGE PEOPLE                     
    2|VINCENT, GENE                      
    6|VINTON, BOBBY                      
    2|VIOLINSKI                          
    6|VISAGE                             
    6|VOYAGE                             
    4|VOYAGER                            
    2|VRETHAMMAR, SYLVIA                 
(24 rows)

There are MANY examples of duplicates throughout the data. A few tests
limiting the range of the select suggest that working with 600-700
records present no problem, but past that, these errors creep in.


Please describe a way to repeat the problem.   Please try to provide a
concise reproducible example, if at all possible: 
- ----------------------------------------------------------------------
I can provide a database of the artists (nearly 13000 of them) that
display the problems. This is 750K text (about 25K gzipped :-). I can
post this, or upload it somewhere on request.



If you know how this problem might be fixed, list the solution below:
- ---------------------------------------------------------------------



- -- 
Nick Lawes.             | Tel: +44 1322 522099  | Fax: +44 1322 555530
BRB Associates Limited. | Tel: +44 171 626 3683 | Fax: +44 171 410 0837
Internet: nick(at)grok(dot)demon(dot)co(dot)uk [158.152.16.66] | (space for rent :-)
amprnet : nick(at)gb7zhr(dot)ampr(dot)org   [44.131.248.1] | BBS : g8zhr(at)gb7hsn



Home | Main Index | Thread Index

Privacy Policy | About PostgreSQL
Copyright © 1996 – 2012 PostgreSQL Global Development Group