#26850 closed defect (bug) (fixed)
Single quotes show up as apostrophes when appearing before numbers
Reported by: | yurivictor | Owned by: | wonderboymusic |
---|---|---|---|
Milestone: | 4.0 | Priority: | normal |
Severity: | minor | Version: | 3.9 |
Component: | Formatting | Keywords: | wptexturize has-patch |
Focuses: | Cc: |
Description
Here's an example:
Also happening on .org trunk and wordpress.com:
http://yurivictor.wordpress.com/2014/01/16/testings-4-through-quotes/
Headlines were manually typed into the WordPress admin, not copy and pasted.
Attachments (9)
Change History (34)
#3
@
11 years ago
- Keywords needs-unit-tests added
- Milestone changed from Awaiting Review to Future Release
- Summary changed from Single quotes show up as apostrophes when appearing before numbers in the_title() to Single quotes show up as apostrophes when appearing before numbers
This isn't the_title() only; it's wptexturize().
The rule is designed for years. So, '99 or '99's gets rendered with an apostrophe preceding the 99. In this case, it isn't a year, it's only one digit. So that should handle at least 0-9 and 100+. What about 10-99? In this case, it's the start of a quote. If we can manage to process that there is a closing quote as well, versus just a standalone number, it's probably a better bet to assume that we're dealing with a quote rather than a year, which is much more rare.
This needs some unit tests. It is also possibly a duplicate of another wptexturize() ticket (most of which also need unit tests).
#4
@
11 years ago
'90's is poor form anyway, if referring to years; it should be '90s. So, yes, assuming that the existence of a closing quote means it's a quote as opposed to a truncated number is a good idea.
#5
@
11 years ago
Made a dirty patch 26850.diff.
Forces regex to search for two digits in a row, rather than any digit.
e.g. Finds '99 as a year, but not '9.
The rest will go through the normal wptexturize flow, which appropriately styles the apostrophes. Doesn't solve all use cases, but it's a start.
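To illustrate the change described in this comment, here is a minimal Python sketch. The two patterns are hypothetical stand-ins for "any digit" versus "two digits in a row"; they are not copied from 26850.diff or from core, and the helper name `apply_year_rule` is made up for this example.

```python
import re

APOSTROPHE = "\u2019"  # right single quotation mark, used here as the apostrophe

# Hypothetical stand-ins for the year rule; not the actual core patterns.
ANY_DIGIT = re.compile(r"'(?=\d)")     # before: a quote followed by any digit
TWO_DIGITS = re.compile(r"'(?=\d\d)")  # after: a quote followed by two digits in a row


def apply_year_rule(text, pattern):
    """Replace each straight quote matched by `pattern` with an apostrophe."""
    return pattern.sub(APOSTROPHE, text)


for sample in ("'9", "'99", "'999"):
    before = apply_year_rule(sample, ANY_DIGIT)
    after = apply_year_rule(sample, TWO_DIGITS)
    print(repr(sample), repr(before), repr(after))
```

With the two-digit version, '9 is left alone for the normal quote handling, while '99 and '999 are both still treated as years, since three digits also contain two digits in a row.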
#6
@
11 years ago
Good first step, I agree. We'll want some unit tests for this, to verify that it no longer messes with '9 or '999. (It looks like '999 will still fail here.)
I tend to agree with Helen, '99's is poor form. But there are some possessive considerations. "1999's introduction of the Euro" becomes "'99's introduction of the Euro". Not that a year possessing something is good form.
An aside, I enjoyed searching around for some writings about the direction and treatment of these apostrophes, and I was happy that the first three results I clicked were WordPress blogs that, in the text, had examples correct. Meanwhile, one post was a 500-word missive on how to get Microsoft Word to do this without screwing it up.
#7
@
11 years ago
- Keywords has-patch added; needs-patch removed
Latest patch 26850.2.diff fixes apostrophes for all numbers 0-9 and greater than 99.
So '9 and '999 will both show up correctly, while '99 will still convert to year.
Uses space or end of line to stop after two digits.
#8
@
11 years ago
A negative lookahead might be better than a positive one: /\'(\d\d)(?!\d)/. Otherwise '99. won't be caught. Or, in lieu of \z and \s, using a word boundary should be sufficient here.
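A rough way to compare the three options mentioned here is to run them side by side. This Python sketch uses assumed patterns: the positive lookahead is a guess at "space or end of line", the negative lookahead is the one quoted in this comment, and the word-boundary variant is one possible reading of the suggestion. Python's `\Z` stands in for `\z`.

```python
import re

# Assumed candidate patterns, written for illustration only.
CANDIDATES = {
    "positive lookahead": re.compile(r"'(\d\d)(?=\s|\Z)"),
    "negative lookahead": re.compile(r"'(\d\d)(?!\d)"),
    "word boundary": re.compile(r"'(\d\d)\b"),
}


def matches(name, text):
    """Return True if the named candidate pattern finds a two-digit year in text."""
    return CANDIDATES[name].search(text) is not None


for sample in ("'9", "'99", "'99.", "'999", "'99s"):
    results = {name: matches(name, sample) for name in CANDIDATES}
    print(repr(sample), results)
```

One difference worth noting: a word boundary does not occur between a digit and a letter, so the word-boundary variant skips '99s while the negative lookahead still matches it.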
#9
@
11 years ago
Good point. Updated to use a negative lookahead 26850.3.diff.
#10
@
11 years ago
I'm really confused by this ticket. We have an ambiguous case where the '\d pattern could mean different things.
The solution seems to be a hack for the phrase '4 years, 3 months,' which would fail predictably if the phrase were '40 years, 3 months,'
How do we reconcile those patterns? What exactly is the expected output?
#11
@
10 years ago
- Keywords close added
I would like to close this as invalid or wontfix. We have an ambiguous case where a decision has already been made to favor abbreviated year numbers. There is nothing to fix here.
#12
@
10 years ago
@miqrogroove, I'll see if I can explain better.
Quote marks show up in the wrong direction when used around numbers that aren't years.
Here's an example post where this happens:
https://yurivictor.wordpress.com/2014/01/16/testings-4-through-quotes/
Notice the headline. 4 is not a year, but it is treated as one, which is why the quote mark points in the wrong direction. The patch would prevent that from happening in 99.9% of cases. It's not a complete solution, but it would definitely fix the above use case and the problem that The Washington Post was having.
If you have a better solution, I would love to hear it, but this is definitely a problem that needs to be solved. The answer isn't obvious, but should at least be discussed.
#13
@
10 years ago
Yeah I got that part. And as I pointed out above, a single apostrophe before a number is syntactically identical and ambiguous between the year abbreviation and the beginning of a quoted number. How do you propose to distinguish between dates and quotes?
#14
@
10 years ago
Right. It's a tough bug.
The code in core currently assumes all apostrophes followed by a number are dates and changes the apostrophe accordingly.
Case 1: '4 is styled like a year, but it's never a year
Case 2: '44 is styled like a year, but it might be a year or the start of a quote
Case 3: '444 is styled like a year, but it's never a year
The patch solves case 1 and case 3. I have no idea how to solve case 2.
If someone comes up with a solution for case 2, I'd love to hear it because smart quotes are kind of a nightmare on almost every platform.
Smart quotes make for strange code.
#15
@
10 years ago
Is the apos-before-digit pattern supposed to never match unless there are exactly 2 digits? The pattern isn't written that way, and it would be a trivial adjustment. If that's all we're talking about here, that can be fixed.
If we want to distinguish between quotes and apostrophes for the 2 digits, that's going to be a whole other can of worms.
#16
@
10 years ago
Regarding comments 3 & 4 above, here is why a closing quote is not algorithmically helpful:
Then she said, 'I went to school in '99 but dropped out.'
vs.
Back in '99 she said, '40 years ago I went to school but dropped out.'
Both sentences have closing quotes, but usage of abbreviations remains ambiguous in simple patterns.
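The point can be checked mechanically. The Python sketch below applies a naive "is there another quote later on?" test to every two-digit token in both sentences. The pattern and the helper `later_quote_exists` are hypothetical and only illustrate the heuristic from comments 3 and 4; they are not part of any patch.

```python
import re

# Illustrative pattern: a straight quote followed by exactly two digits.
TWO_DIGIT = re.compile(r"'(\d\d)(?!\d)")


def later_quote_exists(text):
    """For each '<two digits> token, report whether another straight quote follows it."""
    return [
        (m.group(0), text.find("'", m.end()) != -1)
        for m in TWO_DIGIT.finditer(text)
    ]


first = "Then she said, 'I went to school in '99 but dropped out.'"
second = "Back in '99 she said, '40 years ago I went to school but dropped out.'"

print(later_quote_exists(first))
print(later_quote_exists(second))
```

Every token is followed by another quote, including both uses of '99 as a year, so the presence of a later quote alone cannot separate the abbreviation from the opening of a quotation.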
#18
@
10 years ago
26850.6.diff would also fix a concern from ticket #8775. If we want this pattern to work, it needs to be at the top of the pattern list again.
#19
@
10 years ago
- Keywords needs-unit-tests removed
In miqro-26850.patch:
- Only place an apostrophe before a number when it has exactly two digits.
- Never match '99' with the single prime pattern.
- Always assume '99' is an abbreviated year at the end of a quotation.
- Both test cases in the ticket description are resolved.
- Appropriate unit tests added.
- Resolves the unit test broken in [28721] for #8775.
- Does not fix any part of #27426.
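As a rough illustration of the first and third rules in this list, here is a Python sketch. It is not miqro-26850.patch: the function name, both patterns, and the use of `&#8217;` for the apostrophe and the closing single quote are assumptions made for this example, and the single prime pattern from the second rule is left out entirely.

```python
import re

APOS = "&#8217;"            # assumed entity for the apostrophe
CLOSING_SINGLE = "&#8217;"  # assumed entity for the closing single quote

# Hypothetical patterns written for this sketch.
YEAR_AT_QUOTE_END = re.compile(r"'(\d\d)'(?!\w)")  # '99' ending a quotation
TWO_DIGIT_YEAR = re.compile(r"'(\d\d)(?!\d)")      # quote + exactly two digits


def texturize_two_digit_years(text):
    """Apply the two illustrative year rules, most specific first."""
    # '99' is assumed to be an abbreviated year followed by a closing quote.
    text = YEAR_AT_QUOTE_END.sub(
        lambda m: APOS + m.group(1) + CLOSING_SINGLE, text
    )
    # Any remaining quote before exactly two digits becomes an apostrophe.
    text = TWO_DIGIT_YEAR.sub(lambda m: APOS + m.group(1), text)
    return text


for sample in ("'99'", "in '99.", "'4 years'", "'444", "'33 people went there'"):
    print(repr(sample), "->", repr(texturize_two_digit_years(sample)))
```

In this sketch, one-digit and three-digit numbers are left for the regular quote handling, and a quoted phrase that begins with exactly two digits still receives an apostrophe.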
#23
@
10 years ago
- Owner set to wonderboymusic
- Resolution set to fixed
- Status changed from new to closed
In 28765:
#24
@
10 years ago
Single-quoted phrases beginning with exactly two digits will not be fixed under this ticket for version 4.0.
'33 people went there', she said.
This has been discussed above. Any further discussion, please open a new ticket. Thanks.
Hi,
Confirmed happening on fresh install of WP 3.8
This seems to be the behaviour of the_title() as the back end input box prints fine.
I've also tested it on a WP 3.5.1 installation.
Similar behaviour, except that with the 3.5.1, the first quote gets converted to apostrophe, the 2nd does not. The backend input box is also affected in the same way.