Python-Markdown 3.0 Release Notes¶
We are pleased to release Python-Markdown 3.0, which adds a few new features, fixes various bugs, and deprecates several old features. See the list of changes below for details.
Python-Markdown version 3.0 supports Python versions 2.7, 3.4, 3.5, 3.6, 3.7, PyPy and PyPy3.
Backwards-incompatible changes¶
enable_attributes keyword deprecated¶
The enable_attributes keyword is deprecated in version 3.0 and will be ignored. Previously the keyword was True by default and enabled an undocumented way to define attributes on document elements. The feature has been removed from version 3.0. Because the feature was undocumented, most users should be unaffected. The few who relied on it can re-enable it by using the Legacy Attributes extension.
smart_emphasis keyword and smart_strong extension deprecated¶
The smart_emphasis keyword is deprecated in version 3.0 and will be ignored. Previously the keyword was True by default and caused the parser to ignore middle-word emphasis. Additionally, the optional smart_strong extension provided the same behavior for strong emphasis. Both of those features are now part of the default behavior, and the Legacy Emphasis extension is available to disable that behavior.
output_formats simplified to html and xhtml.¶
The output_format keyword now only accepts two options: html and xhtml. Note that if (x)html1, (x)html4 or (x)html5 are passed in, the number is stripped and ignored.
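As a rough illustration of the two accepted values, the sketch below renders a hard line break with each one. It assumes the keyword is spelled output_format when passed to markdown.markdown(), and the sample text is made up for the example.

```python
import markdown

# Two trailing spaces before the newline produce a hard line break.
text = "first line  \nsecond line"

html_out = markdown.markdown(text, output_format="html")
xhtml_out = markdown.markdown(text, output_format="xhtml")
# A legacy value such as "html5" has its number stripped and is treated as "html".
html5_out = markdown.markdown(text, output_format="html5")

print(html_out)   # void elements are written as <br>
print(xhtml_out)  # void elements are written as <br />
```

The only visible difference in this sample is how the empty br element is serialized.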
safe_mode and html_replacement_text keywords deprecated¶
Both safe_mode and the associated html_replacement_text keywords are deprecated in version 3.0 and will be ignored. The so-called “safe mode” was never actually “safe”, which has given many people a false sense of security when using it. As an alternative, the developers of Python-Markdown recommend that any untrusted content be passed through an HTML sanitizer (like Bleach) after being converted to HTML by markdown. In fact, Bleach Whitelist provides a curated list of tags, attributes, and styles suitable for filtering user-provided HTML using bleach.
If your code previously looked like this:
html = markdown.markdown(text, safe_mode=True)
Then it is recommended that you change your code to read something like this:
import bleach
from bleach_whitelist import markdown_tags, markdown_attrs
html = bleach.clean(markdown.markdown(text), markdown_tags, markdown_attrs)
If you are not interested in sanitizing untrusted text, but simply desire to escape raw HTML, then that can be accomplished through an extension which removes HTML parsing:
import markdown
from markdown.extensions import Extension

class EscapeHtml(Extension):
    def extendMarkdown(self, md):
        # Remove the processors that recognize raw HTML so it is escaped instead.
        md.preprocessors.deregister('html_block')
        md.inlinePatterns.deregister('html')

html = markdown.markdown(text, extensions=[EscapeHtml()])
Because the above extension prevents the HTML from being parsed, the serializer will escape the raw HTML, which is exactly what happened in previous versions with safe_mode="escape".
Positional arguments deprecated¶
Positional arguments on the markdown.Markdown() class are deprecated, as are all except the text argument on the markdown.markdown() wrapper function. Using positional arguments will raise an error. Only keyword arguments should be used. For example, if your code previously looked like this:
html = markdown.markdown(text, [SomeExtension()])
Then it is recommended that you change it to read something like this:
html = markdown.markdown(text, extensions=[SomeExtension()])
Note
This change is being made as a result of deprecating "safe_mode", as the safe_mode argument was one of the positional arguments. When that argument is removed, the two arguments following it will no longer be at the correct position. It is recommended that you always use keywords when they are supported for this reason.
Extension name behavior has changed¶
In previous versions of Python-Markdown, the built-in extensions received special status and did not require the full path to be provided. Additionally, third party extensions whose name started with "mdx_" received the same special treatment. This is no longer the case.
Support has been added for extensions to define an entry point. An entry point is a string name which can be used to point to an Extension class. The built-in extensions now have entry points which match the old short names, and any third-party extensions which define entry points can now get the same behavior. See the documentation for each specific extension to find the assigned name.
If an extension does not define an entry point, then the full path to the extension must be used. See the documentation for a full explanation of the current behavior.
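The two ways of naming an extension can be compared with a small sketch. It assumes the built-in Table of Contents extension is reachable both through the entry point name "toc" and through the dotted module path markdown.extensions.toc, and that the library was installed in a way that registers its entry points.

```python
import markdown

text = "# Title"

# Short name, resolved through the extension's entry point.
by_entry_point = markdown.markdown(text, extensions=["toc"])

# Full dotted path to the module that defines the extension.
by_full_path = markdown.markdown(text, extensions=["markdown.extensions.toc"])

print(by_entry_point == by_full_path)
```

Both calls load the same extension, so the header receives the same generated id either way.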
Extension configuration as part of extension name deprecated¶
The previously documented method of appending the extension configuration options as a string to the extension name is deprecated and will raise an error. The extension_configs keyword should be used instead. See the documentation for a full explanation of the current behavior.
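A minimal sketch of the keyword-based configuration follows. It assumes the Table of Contents extension and its permalink option purely as an example; any extension's options would be passed the same way, keyed by the name used in extensions.

```python
import markdown

text = "# Title"

html = markdown.markdown(
    text,
    extensions=["toc"],
    # Options are grouped per extension, keyed by the name used above.
    extension_configs={"toc": {"permalink": True}},
)

print(html)
```

With permalink enabled, the rendered header contains an extra anchor element carrying the headerlink class.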
HeaderId extension deprecated¶
The HeaderId Extension is deprecated and will raise an error if specified. Use the Table of Contents Extension instead, which offers most of the features of the HeaderId Extension and more (support for meta data is missing).
Extension authors who have been using the slugify and unique functions defined in the HeaderId Extension should note that those functions are now defined in the Table of Contents extension and should adjust their import statements accordingly (from markdown.extensions.toc import slugify, unique).
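For extension authors updating their imports, the following sketch shows the two helpers in their new location. The header text and the set of already used ids are invented for the example.

```python
from markdown.extensions.toc import slugify, unique

# Turn header text into an id-friendly slug using "-" as the separator.
slug = slugify("Hello World!", "-")

# Pretend the document already contains a header with the same id.
used_ids = {"hello-world"}

# unique() appends a numeric suffix until the id is unused, then records it.
deduped = unique(slug, used_ids)

print(slug, deduped)
```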
Homegrown OrderedDict has been replaced with a purpose-built Registry¶
All processors and patterns now get “registered” to a Registry. A backwards compatible shim is included so that existing simple extensions should continue to work. A DeprecationWarning will be raised for any code which calls the old API.
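Registering a processor with the new API might look like the sketch below. ShoutPreprocessor, ShoutExtension, the name "shout", and the priority 25 are all hypothetical choices made for illustration; only the register(item, name, priority) call reflects the Registry itself.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class ShoutPreprocessor(Preprocessor):
    """Hypothetical preprocessor that upper-cases every source line."""

    def run(self, lines):
        return [line.upper() for line in lines]


class ShoutExtension(Extension):
    def extendMarkdown(self, md):
        # register(item, name, priority): items with a higher priority run earlier.
        md.preprocessors.register(ShoutPreprocessor(md), "shout", 25)


html = markdown.markdown("hello", extensions=[ShoutExtension()])
print(html)
```

The matching deregister(name) call removes an item again, as the EscapeHtml example earlier in these notes does for the built-in HTML processors.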
Markdown class instance references.¶
Previously, instances of the Markdown class were represented as any one of md, md_instance, or markdown. This inconsistency made it difficult to develop extensions, or even to maintain the existing code. Now, all instances are consistently represented as md.
The old attributes on class instances still exist, but raise a DeprecationWarning when accessed. Also, on classes where the instance was optional, the attribute always exists now and is simply None if no instance was provided (previously the attribute would not exist).
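The consistent md attribute can be checked with a short sketch. It uses the base Preprocessor class only as a convenient example of a processor whose Markdown instance is optional.

```python
import markdown
from markdown.preprocessors import Preprocessor

md = markdown.Markdown()

# A processor created with an instance exposes it as `md`.
attached = Preprocessor(md)

# A processor created without one still has the attribute, set to None.
detached = Preprocessor()

print(attached.md is md, detached.md is None)
```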
markdown.util.isBlockLevel deprecated¶
The markdown.util.isBlockLevel function is deprecated and will raise a DeprecationWarning. Instead, extensions should use the is_block_level method of the Markdown class instance. Additionally, a list of block level elements is defined in the block_level_elements attribute of the Markdown class, which extensions can access to alter the list of elements which are treated as block level elements.
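A small sketch of the instance-level check follows. It assumes the method is spelled is_block_level on the Markdown instance, and the x-card element is a hypothetical custom tag. The list is rebuilt and reassigned rather than mutated in place, which is one cautious way to alter it for a single instance.

```python
import markdown

md = markdown.Markdown()

div_is_block = md.is_block_level("div")
span_is_block = md.is_block_level("span")

# Treat a hypothetical custom element as block level for this instance only.
md.block_level_elements = list(md.block_level_elements) + ["x-card"]
card_is_block = md.is_block_level("x-card")

print(div_is_block, span_is_block, card_is_block)
```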
md_globals keyword deprecated from extension API¶
Previously, the extendMarkdown method of markdown.extensions.Extension subclasses accepted an md_globals keyword, which contained the value returned by Python’s globals() built-in function. As all of the configuration is now held within the Markdown class instance, access to the globals is no longer necessary and any extensions which expect the keyword will raise a DeprecationWarning. A future release will raise an error.
markdown.version and markdown.version_info deprecated¶
Historically, version numbers were acquired via the attributes markdown.version and markdown.version_info. Moving forward, a more standardized approach is being followed and versions are acquired via the markdown.__version__ and markdown.__version_info__ attributes. The legacy attributes are still available to allow distinguishing versions between the legacy Markdown 2.0 series and the Markdown 3.0 series, but in the future the legacy attributes will be removed.
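Code that needs to branch on the installed version could read the new attributes roughly as follows. The variable names are illustrative only.

```python
import markdown

version_string = markdown.__version__      # a string such as "3.0"
version_tuple = markdown.__version_info__  # a tuple whose first item is the major version

is_3_series_or_later = version_tuple[0] >= 3

print(version_string, is_3_series_or_later)
```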
Added new, more flexible InlineProcessor class¶
A new InlineProcessor class handles inline processing much better and allows for more flexibility. The new InlineProcessor classes no longer utilize unnecessary pretext and post-text captures. The new class can accept the buffer that is being worked on, manually process the text without regular expressions, and return new replacement bounds. This lets the parser handle links in a better way and cope with nested brackets and logic that is too complex for a regular expression.
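To make the "replacement bounds" idea concrete, here is a minimal sketch of an inline processor that wraps text between double hyphens in a del element. The class names, the pattern, the registry name "del", and the priority 175 are illustrative assumptions and not part of the release.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import InlineProcessor


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        # `data` is the full buffer being processed; `m` is the regex match.
        el = etree.Element("del")
        el.text = m.group(1)
        # Return the new element plus the start and end of the text it replaces.
        return el, m.start(0), m.end(0)


class DelExtension(Extension):
    def extendMarkdown(self, md):
        # The pattern has no pretext or post-text groups, only the content.
        processor = DelInlineProcessor(r"--(.*?)--", md)
        md.inlinePatterns.register(processor, "del", 175)


html = markdown.markdown("--gone--", extensions=[DelExtension()])
print(html)
```

Because handleMatch returns the bounds itself, a processor is free to scan data by hand and report a wider or narrower span than the regular expression matched.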
New features¶
The following new features have been included in the release:
- A new testing framework is included as a part of the Markdown library, which can also be used by third party extensions.
- A new toc_depth parameter has been added to the Table of Contents Extension.
- A new toc_tokens attribute has been added to the Markdown class by the Table of Contents Extension, which contains the raw tokens used to build the Table of Contents. Users can use this to build their own custom Table of Contents rather than needing to parse the HTML available on the toc attribute of the Markdown class.
- When the Table of Contents Extension is used in conjunction with the Attribute Lists Extension and a data-toc-label attribute is defined on a header, the content of the data-toc-label attribute is now used as the content of the Table of Contents item for that header.
- Additional CSS class names can be appended to Admonitions.
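The toc_depth and toc_tokens additions can be combined in a short sketch that builds a plain-text outline instead of parsing the HTML on the toc attribute. The sample document and the outline helper are hypothetical, and the token keys name, id, and children are assumed to be present on each token.

```python
import markdown

md = markdown.Markdown(
    extensions=["toc"],
    # Only include headers down to level 2 in the Table of Contents.
    extension_configs={"toc": {"toc_depth": 2}},
)
md.convert("# Guide\n\n## Setup\n\n### Details")


def outline(tokens, indent=0):
    """Flatten the nested tokens into indented 'name (#id)' lines."""
    lines = []
    for token in tokens:
        lines.append("  " * indent + "{} (#{})".format(token["name"], token["id"]))
        lines.extend(outline(token["children"], indent + 1))
    return lines


toc_lines = outline(md.toc_tokens)
print("\n".join(toc_lines))
```

The level 3 header is left out of the tokens because of the toc_depth setting.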