After finding a few interesting sources, I ended up with this solution (with an extra function added to handle a few Dutch characters):
drop function if exists fn_remove_accents;
delimiter |
create function fn_remove_accents( textvalue varchar(20000) )
returns varchar(20000)
begin
set @textvalue = textvalue;
-- ACCENTS
set @withaccents = 'ŠšŽžÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝŸÞàáâãäåæçèéêëìíîïñòóôõöøùúûüýÿþƒ';
set @withoutaccents = 'SsZzAAAAAAACEEEEIIIINOOOOOOUUUUYYBaaaaaaaceeeeiiiinoooooouuuuyybf';
-- char_length counts characters; length counts bytes and overshoots on multi-byte text
set @count = char_length(@withaccents);
while @count > 0 do
set @textvalue = replace(@textvalue, substring(@withaccents, @count, 1), substring(@withoutaccents, @count, 1));
set @count = @count - 1;
end while;
-- SPECIAL CHARS
set @special = '!@#$%¨&*()_+=§¹²³£¢¬"`´{[^~}]<,>.:;?/°ºª+*|\\''';
set @count = char_length(@special);
while @count > 0 do
set @textvalue = replace(@textvalue, substring(@special, @count, 1), '');
set @count = @count - 1;
end while;
return @textvalue;
end
|
DROP FUNCTION IF EXISTS `slugify`;
DELIMITER ;;
CREATE DEFINER=`root`@`localhost` # DEFINER names the account whose privileges the function runs with; adjust or drop it for your server
FUNCTION `slugify`(dirty_string varchar(200))
RETURNS varchar(200) CHARSET latin1
DETERMINISTIC
BEGIN
DECLARE x, y , z Int;
Declare temp_string, allowed_chars, new_string VarChar(200);
Declare is_allowed Bool;
Declare c, check_char VarChar(1);
set allowed_chars = "abcdefghijklmnopqrstuvwxyz0123456789-";
-- Expand "&" before calling fn_remove_accents, which strips it along with the other special characters
set temp_string = replace(LOWER(dirty_string), '&', ' and ');
set temp_string = fn_remove_accents(temp_string);
Select temp_string Regexp('[^a-z0-9]+') into x;
If x = 1 then
set z = 1;
While z <= Char_length(temp_string) Do
Set c = Substring(temp_string, z, 1);
Set is_allowed = False;
Set y = 1;
Inner_Check: While y <= Char_length(allowed_chars) Do
If (strCmp(ascii(Substring(allowed_chars,y,1)), Ascii(c)) = 0) Then
Set is_allowed = True;
Leave Inner_Check;
End If;
Set y = y + 1;
End While;
If is_allowed = False Then
Set temp_string = Replace(temp_string, c, '-');
End If;
set z = z + 1;
End While;
End If;
Select temp_string Regexp("^-|-$|'") into x;
If x = 1 Then
Set temp_string = Replace(temp_string, "'", '');
Set z = Char_length(temp_string);
Set y = Char_length(temp_string);
Dash_check: While z > 1 Do
If Strcmp(SubString(temp_string, -1, 1), '-') = 0 Then
Set temp_string = Substring(temp_string,1, y-1);
Set y = y - 1;
Else
Leave Dash_check;
End If;
Set z = z - 1;
End While;
End If;
Repeat
Select temp_string Regexp("--") into x;
If x = 1 Then
Set temp_string = Replace(temp_string, "--", "-");
End If;
Until x <> 1 End Repeat;
Return temp_string;
END;;
DELIMITER ;
UPDATE wp_posts SET post_name = slugify(post_title) WHERE post_name LIKE '%copy%' AND post_type = 'MY_POST_TYPE';
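Explaining the query: the first two functions are taken directly from sources [1] and [2], with a small edit to handle uppercase characters.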
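Finally, the update query matches the duplicated posts that contain the word copy in their slug, but depending on your situation you may want to make this rule stricter. I wanted to run this query against my custom post type only, so I added an extra post type check to the query, which is no big deal.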
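If you want to see what would change before running the update, a dry run along these lines may help. This is only a sketch: it assumes both functions above already exist, and the pattern assumes duplicated slugs end in -copy or -copy-N, which is a guess about how your duplication plugin names them, so adjust it to your data.

```sql
-- Dry run: list the posts the UPDATE would touch, with the current and proposed slug.
-- Assumption: duplicated slugs end in "-copy" or "-copy-<number>".
SELECT ID,
       post_title,
       post_name           AS current_slug,
       slugify(post_title) AS proposed_slug
FROM wp_posts
WHERE post_type = 'MY_POST_TYPE'
  AND post_name REGEXP '-copy(-[0-9]+)?$';
```

The same WHERE clause can then replace the LIKE '%copy%' condition in the UPDATE, so that a post whose slug merely contains "copy" (for example "copyright-notice") is left alone.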
Matched posts are assigned a newly generated slug based on the title (post_title), or you can specify another source for the slug generation.
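One thing to keep in mind: a raw UPDATE bypasses the uniqueness check WordPress normally applies when it saves a post, so two posts with the same title would end up with the same slug. A rough check, again assuming the functions above exist and reusing the same placeholder post type, could look like this:

```sql
-- List proposed slugs that would be shared by more than one matched post.
-- Only compares the matched posts with each other, not with untouched posts.
SELECT slugify(post_title) AS proposed_slug,
       COUNT(*)            AS posts_sharing_slug
FROM wp_posts
WHERE post_type = 'MY_POST_TYPE'
  AND post_name LIKE '%copy%'
GROUP BY proposed_slug
HAVING posts_sharing_slug > 1;
```

If this returns rows, those posts need a distinguishing suffix (or a different source column) before the update is applied.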
Sources:
[1] Replace non-latin characters selectively
[2] Regenerate a slug from a free-text name
[3] WebElaine's answer (on this page)