Re: Re: [GENERAL] Sorting problems with SELECT * FROM

Lists: pgsql-general
From: Nico Grubert <nicogrubert(at)arcor(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Date: 2005-10-30 16:17:01
Message-ID: 4364F1FD.6040102@arcor.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi there,

I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE 'Ö%'

I am running Postgres 8.02 with a database whose character encoding is
UNICODE.

The SQL Query

SELECT *
FROM member
WHERE name LIKE 'O%'
OR
name like 'Ö%'
ORDER BY name

returns this:
Öhlmann
Öhmann
Obenaus
Ochoa
O'Donovan
Oehme
Oklant
Oltub
Oltüch
Oltutz
Oltüwer

According to german sorting rules the result is fine except the both
first entries "Öhlmann" and "Öhmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
Obenaus
Ochoa
O'Donovan
Oehme
Öhlmann
Öhmann
Oklant
Oltub
Oltüch
Oltutz
Oltüwer

The same problem accours when using "E" where my result is this:
Élie de Beaumont
Eberer
Ecü
Edding
Emmer

The proper result should be:
Eberer
Ecü
Edding
Élie de Beaumont
Emmer

Any idea how I can solve this problem?

Thank you very much in advance,
Nico


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Nico Grubert <nicogrubert(at)arcor(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Date: 2005-10-30 16:31:43
Message-ID: 7660.1130689903@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Nico Grubert <nicogrubert(at)arcor(dot)de> writes:
> I have a problem when sorting records with:
> SELECT * FROM table WHERE name LIKE '%'

> I am running Postgres 8.02 with a database whose character encoding is
> UNICODE.

... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)

regards, tom lane


From: Nico Grubert <nicogrubert(at)arcor(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Date: 2005-10-30 16:37:54
Message-ID: 4364F6E2.5080400@arcor.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


> ... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)

Can I find out out these settings in "phpPgAdmin"?
Or can I use LC_COLLATE and LC_CTYPE in the SQL Query?


From: Nico Grubert <nicogrubert(at)arcor(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Date: 2005-10-30 16:39:49
Message-ID: 4364F755.1020605@arcor.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Ah, I found it:

lc_collate: de_DE(at)euro
lc_ctype: de_DE(at)euro


From: Nico Grubert <nicogrubert(at)arcor(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Date: 2005-10-31 17:39:40
Message-ID: 436656DC.2040803@arcor.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


> I have a problem when sorting records with:
> SELECT * FROM table WHERE name LIKE 'Ö%'
>
> I am running Postgres 8.02 with a database whose character encoding is
> UNICODE.
>
> The SQL Query
>
> SELECT *
> FROM member
> WHERE name LIKE 'O%'
> OR
> name like 'Ö%'
> ORDER BY name
>
>
> returns this:
> Öhlmann
> Öhmann
> Obenaus
> Ochoa
> O'Donovan
> Oehme
> Oklant
> Oltub
> Oltüch
> Oltutz
> Oltüwer
>
> According to german sorting rules the result is fine except the both
> first entries "Öhlmann" and "Öhmann".
> Why do appear these records at the beginning of the list?
> The proper result should read like this:
> Obenaus
> Ochoa
> O'Donovan
> Oehme
> Öhlmann
> Öhmann
> Oklant
> Oltub
> Oltüch
> Oltutz
> Oltüwer
>
>
>
> The same problem accours when using "E" where my result is this:
> Élie de Beaumont
> Eberer
> Ecü
> Edding
> Emmer
>
> The proper result should be:
> Eberer
> Ecü
> Edding
> Élie de Beaumont
> Emmer
>
>
> Any idea how I can solve this problem?
>
>
> Thank you very much in advance,
> Nico

To complete the missing information, here are the variables set for the
databases:
add_missing_from on
archive_command unset
australian_timezones off
authentication_timeout 60
bgwriter_delay 200
bgwriter_maxpages 100
bgwriter_percent 1
block_size 8192
check_function_bodies on
checkpoint_segments 3
checkpoint_timeout 300
checkpoint_warning 30
client_encoding UNICODE
client_min_messages notice
commit_delay 0
commit_siblings 5
cpu_index_tuple_cost 0.001
cpu_operator_cost 0.0025
cpu_tuple_cost 0.01
custom_variable_classes unset
DateStyle ISO, MDY
db_user_namespace off
deadlock_timeout 1000
debug_pretty_print off
debug_print_parse off
debug_print_plan off
debug_print_rewritten off
debug_shared_buffers 0
default_statistics_target 10
default_tablespace unset
default_transaction_isolation read committed
default_transaction_read_only off
default_with_oids on
effective_cache_size 1000
enable_hashagg on
enable_hashjoin on
enable_indexscan on
enable_mergejoin on
enable_nestloop on
enable_seqscan on
enable_sort on
enable_tidscan on
explain_pretty_print on
extra_float_digits 0
from_collapse_limit 8
fsync on
geqo on
geqo_effort 5
geqo_generations 0
geqo_pool_size 0
geqo_selection_bias 2
geqo_threshold 12
integer_datetimes on
join_collapse_limit 8
lc_collate de_DE(at)euro
lc_ctype de_DE(at)euro
lc_messages de_DE(at)euro
lc_monetary de_DE(at)euro
lc_numeric de_DE(at)euro
lc_time de_DE(at)euro
listen_addresses localhost
log_connections off
log_destination stderr
log_disconnections off
log_duration off
log_error_verbosity default
log_executor_stats off
log_hostname off
log_line_prefix unset
log_min_duration_statement -1
log_min_error_statement panic
log_min_messages notice
log_parser_stats off
log_planner_stats off
log_rotation_age 1440
log_rotation_size 10240
log_statement none
log_statement_stats off
log_truncate_on_rotation off
maintenance_work_mem 16384
max_connections 100
max_files_per_process 1000
max_fsm_pages 20000
max_fsm_relations 1000
max_function_args 32
max_identifier_length 63
max_index_keys 32
max_locks_per_transaction 64
max_stack_depth 2048
password_encryption on
port 5432
pre_auth_delay 0
random_page_cost 4
redirect_stderr off
regex_flavor advanced
rendezvous_name unset
search_path $user,public
server_encoding UNICODE
server_version 8.0.2
shared_buffers 1000
silent_mode off
sql_inheritance on
ssl off
statement_timeout 0
stats_block_level off
stats_command_string off
stats_reset_on_server_start on
stats_row_level off
stats_start_collector on
superuser_reserved_connections 2
syslog_facility LOCAL0
syslog_ident postgres
TimeZone Europe/Berlin
trace_notify off
transaction_isolation read committed
transaction_read_only off
transform_null_equals off
unix_socket_group unset
unix_socket_permissions 511
vacuum_cost_delay 0
vacuum_cost_limit 200
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
wal_buffers 8
wal_sync_method fdatasync
work_mem 1024
zero_damaged_pages off

Is there any explaination why the result is not sorted properly?


From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Re: [GENERAL] Sorting problems with SELECT * FROM
Date: 2005-10-31 20:34:40
Message-ID: 20051031212937.8645272@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Nico Grubert wrote:

> Ah, I found it:
>
> lc_collate: de_DE(at)euro
> lc_ctype: de_DE(at)euro

This is an iso-8859-15 locale, isn't it?
If your database encoding is UNICODE, I believe you'd have more success
using an UTF8 locale, such as de_DE.UTF-8 in your case.

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org